Abstract

We show that large deviation properties of Erdős-Rényi random
graphs can be derived from the free energy of the $q$-state Potts model
of statistical mechanics. More precisely, the Legendre transform of
the Potts free energy with respect to $\ln q$ is related to the component
generating function of the graph ensemble. This generalizes the
well-known mapping between typical properties of random graphs and
the $q \to 1$ limit of the Potts free energy. For exponentially rare
graphs we explicitly calculate the number of components, the size of 
the giant component, the degree distributions inside and outside the 
giant component, and the distribution of small component sizes. We 
also perform numerical simulations which are in very good agreement 
with our analytical work. Finally we demonstrate how the same 
results can be derived by studying the evolution of random graphs 
under the insertion of new vertices and edges, without recourse to 
the thermodynamics of the Potts model. 

PACS: 02.50.-r, 05.50.+q, 75.10.Nr

1 Introduction 

Random graphs have been a subject of tremendous interest in probability
and graph theory ever since the seminal work by Erdős and Rényi [1]
more than four decades ago. In addition to fixed edge number and fixed
edge probability distributions also random graphs with constant vertex de- 
gree or power law degree distribution |31I11 have been investigated. Most 
of the efforts devoted to the study of the properties of random graphs have 
taken advantage of the fact that these properties undergo some concentra- 
tion process in the infinite size (number of vertices) limit. For instance, the 
number of vertices in the largest component or the number of connected
components, which are stochastic in nature, become highly concentrated in 
this limit, and with high probability do not differ from their average values. 

For large but finite sizes, properties such as those mentioned above
obviously fluctuate from graph to graph. Understanding their statistical
deviations is important for several problems in statistical physics,
e.g. for the life-time of metastable states and the extremal properties
of models
defined on random graphs [Hj, as well as in computer science, e.g. for 
information-packet transmission in random networks [HI El j resolution of 
random decision problems with search procedures [30^] and others. Up 
to now apparently little attention has been paid to a quantitative
characterization of large deviations in random graph ensembles [12].

The present work is intended to contribute to an improved understanding
of rare fluctuations in random graphs. Our main objective was to devise
a microscopic "mean-field" approach that handles such rare deviations
in much the same way as average properties of various similar
problems, as e.g. bootstrap and rigidity percolation ^21 and spin-glasses 
[14]. The mean-field approach relies on a statistical stability argument: a
large graph is not strongly modified when adding an edge and/or a vertex. 
This statement can be translated into some self-consistent equations for the 
average value of physical properties of interest, as e.g. the magnetization 
for a spin system, or the probability of belonging to the $k$-core for bootstrap
percolation. We will show in the present work that a similar self-consistent 
approach can also be successfully used to access large deviations in random 
graphs. 

The main property we focus on throughout this paper is the number of 
connected components of a random graph. As established by Fortuin and 
Kasteleyn [15], several properties of random graphs with a typical number
of components can be inferred from the knowledge of the thermodynamics
of the $q$-state Potts model on a complete graph for values of $q$ around 1.
We will show that the thermodynamic properties for general values of q can 
be used to additionally characterize the properties of random graphs with 
an atypical number of components. This allows us to verify the validity of 
our microscopic mean-field approach. 

This paper is organized as follows. In Section 2, we introduce the basic 
definitions and notations for the quantities studied. Section 3 is devoted 
to the derivation of rare graph properties through the study of the Potts
model. We show in Section 4 how these results can be rederived through the
requirement of the statistical stability of very large atypical graphs against
the addition of a vertex and its attached edges, or an edge. In Section 5 we
describe our numerical procedure to simulate large deviation properties of
random graph ensembles. Conclusions are given in Section 6.

2 Basic notions 

We begin by fixing some vocabulary. For a detailed and precise account 
on random graphs we refer the reader to the textbook ^^1- A graph G is 
a collection of vertices numbered by $i = 1, \ldots, N$ with edges $(i,j)$,
$i,j = 1, \ldots, N$, connecting them. The number of edges is between 0 (for the
empty graph) and $N(N-1)/2$ (for the complete graph). A component of a
graph is a subset of connected vertices which are disconnected from the rest of
the graph. The size $S$ of a component is the number of vertices it contains.
Hence the empty graph consists of $N$ components of size 1 whereas the
complete graph is made from a single component of size $N$. The number of
components of a graph $G$ is denoted by $C(G)$. We are generally interested
in properties of large graphs, $N \to \infty$.

We will consider random graphs in the sense that an edge between
two vertices may be present or absent with a certain probability. The
various joint probabilities to be discussed below will be denoted in the
form $P(x_1, x_2, \ldots; a_1, a_2, \ldots)$ with the $x_i$ representing the random
variables and the $a_i$ denoting the parameters of the distribution. In
particular we consider random graphs in which each pair of vertices is
connected by an edge with probability $\gamma/N$ independently of all other
pairs of vertices. The parameter $\gamma$ characterizes the connectivity of the
graph. Since each vertex establishes edges with probability $\gamma/N$ with all
the other $N-1$ vertices, $\gamma$ is in the limit $N \to \infty$ just the typical
degree $d^*$ of a vertex, giving the average number of edges emanating from it.

More precisely, in this limit the degree $d$ of a vertex is a random variable
obeying a Poisson law with parameter $\gamma$,

$$P(d;\gamma) = e^{-\gamma}\,\frac{\gamma^d}{d!}. \tag{1}$$

In particular, $P(d=0;\gamma) = e^{-\gamma}$ is the fraction of isolated vertices. Hence
the average number of components of a random graph of the described type
is bounded from below by $N e^{-\gamma}$. Note also that the typical degree $d^*$ of
a vertex remains finite for $N \to \infty$. Typical realizations of such random
graphs are therefore sparse.
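
The statements above can be checked on a single sampled graph. The following Python sketch is an illustration only: the helper names `sample_edges`, `count_components` and `isolated_fraction`, and the use of a union-find structure, are choices made here and are not taken from the paper. It draws each pair of vertices with probability $\gamma/N$, counts the components $C(G)$, and measures the fraction of isolated vertices, which should be close to $e^{-\gamma}$ for large $N$ and always bounds $C(G)/N$ from below.

```python
import math
import random


def sample_edges(n, gamma, rng):
    """Connect each of the n(n-1)/2 vertex pairs independently with probability gamma/n."""
    p = gamma / n
    return [(i, j) for i in range(n) for j in range(i + 1, n) if rng.random() < p]


def count_components(n, edges):
    """Number of connected components C(G), using union-find with path halving."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components = n  # the empty graph has n components of size 1
    for i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # an edge joining two different components merges them
            parent[ri] = rj
            components -= 1
    return components


def isolated_fraction(n, edges):
    """Fraction of vertices of degree zero."""
    touched = set()
    for i, j in edges:
        touched.add(i)
        touched.add(j)
    return 1.0 - len(touched) / n


if __name__ == "__main__":
    rng = random.Random(1)  # arbitrary seed, chosen only for reproducibility
    n, gamma = 1000, 1.0
    edges = sample_edges(n, gamma, rng)
    print("c =", count_components(n, edges) / n)
    print("isolated fraction =", isolated_fraction(n, edges), "vs exp(-gamma) =", math.exp(-gamma))
```

Sampling in this direct way only reaches typical graphs; it says nothing about the exponentially rare graphs that are the subject of the paper.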

The probability $P(G;\gamma,N)$ of one particular random graph $G$ with $N$
vertices and parameter $\gamma$ derives from the binomial law,

$$P(G;\gamma,N) = \left(\frac{\gamma}{N}\right)^{L(G)} \left(1 - \frac{\gamma}{N}\right)^{\binom{N}{2} - L(G)}, \tag{2}$$

where $L(G) \sim O(N)$ denotes the number of edges of graph $G$. To describe
the decomposition of a large random graph into its components, it is
convenient to introduce the probability $P(C;\gamma,N)$ of a random graph with $N$
vertices to have $C$ components,

$$P(C;\gamma,N) = \sum_G P(G;\gamma,N)\,\delta(C, C(G)), \tag{3}$$

where $\delta(a,b)$ denotes the Kronecker delta.

A general observation is that for given $\gamma$ and large $N$ the probability
$P(C;\gamma,N)$ gets sharply peaked at some typical value $C^*$ of $C$, with the
probabilities for values of $C$ significantly different from $C^*$ being
exponentially small in $N$. To describe this fact more quantitatively we
introduce the number of components per vertex $c = C/N$ together with the
quantity

$$\omega(c,\gamma) = \lim_{N\to\infty} \frac{1}{N} \ln P(C = cN;\gamma,N). \tag{4}$$

Clearly $\omega(c,\gamma) \le 0$, and the typical value $c^*$ of $c$ has $\omega(c^*,\gamma) = 0$.
Averages with $P(G;\gamma,N)$ are therefore dominated by graphs with a typical
number of components.

The focus of the present paper is on properties of random graphs which
are atypical with respect to their number of components $C$. In order to get
access to the properties of these graphs we introduce the biased probability
distributions

$$P(G;\gamma,q,N) = \frac{1}{Z(\gamma,q,N)}\,P(G;\gamma,N)\,q^{C(G)} \tag{5}$$

with $Z(\gamma,q,N)$ defined by

$$Z(\gamma,q,N) = \sum_G P(G;\gamma,N)\,q^{C(G)} = \sum_C P(C;\gamma,N)\,q^{C}. \tag{6}$$

The normalization constant $Z(\gamma,q,N)$ in (5) has hence the meaning of a
component generating function of $P(G;\gamma,N)$. Contrary to averages with
$P(G;\gamma,N)$, those with $P(G;\gamma,q,N)$ are dominated by graphs with an
atypical number of components which is fixed implicitly by the parameter $q$.
Values of $q$ smaller than 1 shift weight to graphs with few components
whereas for $q > 1$ graphs with many components dominate the distribution.
The typical case is obviously recovered for $q = 1$.

Similar to $\omega(c,\gamma)$ it is convenient to introduce the function

$$\varphi(\gamma,q) = \lim_{N\to\infty} \frac{1}{N} \ln Z(\gamma,q,N). \tag{7}$$

From (4) and (6) it follows to leading order in $N$ that

$$Z(\gamma,q,N) \simeq \int_0^1 dc\, \exp\big(N[\omega(c,\gamma) + c \ln q]\big), \tag{8}$$

and performing the integral by the Laplace method for large $N$ we find
that $\varphi(\gamma,q)$ and $\omega(c,\gamma)$ are Legendre transforms of each other:

$$\varphi(\gamma,q) = \max_c\,[\omega(c,\gamma) + c\ln q], \tag{9}$$

$$\omega(c,\gamma) = \min_q\,[\varphi(\gamma,q) - c\ln q]. \tag{10}$$

The large deviation properties of the ensemble of random graphs as
characterized by $\omega(c,\gamma)$ can hence be inferred from $\varphi(\gamma,q)$. In the next
section we show how $\varphi(\gamma,q)$ can be obtained from the statistical mechanics
of the Potts model.

For later use we also note that from differentiating (7) with respect to
$\gamma$ we find using (6) and (2) to leading order in $N$

$$\ell(\gamma,q) = \frac{\gamma}{2} + \gamma\,\frac{\partial \varphi}{\partial \gamma}. \tag{11}$$

Here $\ell(\gamma,q)$ denotes the average number of edges per vertex in the graph
where the average is performed with the distribution (5).



3 Thermodynamics of atypical graphs 
3.1 The mean-field Potts model 

It has long been known that certain characteristics of random graphs
are related to the thermodynamic properties of the Potts model [15].
The Potts model is defined in terms of an energy function $E(\{\sigma_i\})$
depending on $N$ spin variables $\sigma_i$, $i = 1, \ldots, N$, which may take on $q$
distinct values $\sigma = 0, 1, \ldots, q-1$. In the mean-field variant the energy
function reads

$$E(\{\sigma_i\}) = -\frac{1}{N}\sum_{i<j} \delta(\sigma_i,\sigma_j) - h \sum_{\sigma=0}^{q-1} u_\sigma \sum_i \delta(\sigma_i,\sigma), \tag{12}$$

where $h u_\sigma$ is an auxiliary field parallel to the direction $\sigma$. The
thermodynamic properties of the system at inverse temperature $\beta$ can be
derived from the partition function

$$Z(\beta,h,q,\{u_\sigma\},N) = \sum_{\{\sigma_i\}} \exp\big(-\beta E(\{\sigma_i\})\big), \tag{13}$$

where the sum runs over all $q^N$ spin configurations $\{\sigma_i\}$. A standard
analysis (cf. the appendix) gives for the free energy

$$f(\beta,h,q,\{u_\sigma\}) = -\lim_{N\to\infty} \frac{1}{\beta N} \ln Z(\beta,h,q,\{u_\sigma\},N) \tag{14}$$

at $h = 0$ the result

$$f(\beta,q) = \operatorname{extr}_{s_0}\left[ -\frac{1}{2q} - \frac{q-1}{2q}\,s_0^2 - \frac{\ln q}{\beta} + \frac{1+(q-1)s_0}{\beta q}\ln\big(1+(q-1)s_0\big) + \frac{q-1}{\beta q}(1-s_0)\ln(1-s_0) \right]. \tag{15}$$

The saddle-point value $s_0(\beta,q)$ extremizing the expression in the brackets
is the stable solution of the equation

$$e^{\beta s_0} = \frac{1+(q-1)s_0}{1-s_0}. \tag{16}$$

Clearly $s_0 = 0$ is always a solution of this equation. It is, however, unstable
for large $\beta$ and another, non-trivial solution becomes stable which describes
the spontaneous appearance of order in the low temperature phase. Fig. 1
displays the solutions of (16) as function of $\beta$ for different values of $q$. Note
the subcritical bifurcation in $s_0(\beta,q)$ for $q > 2$.

Figure 1: Solution $s_0(\beta,q)$ of the saddle-point Eq. (16) as function of $\beta$ for
$q = 1, 2, 4, 6$ (from left to right). For $q \le 2$ the non-trivial solution $s_0 > 0$
branches off continuously from the high-temperature solution $s_0 = 0$ at
$\beta = q$. For $q > 2$ the new solution appears discontinuously at a spinodal
point below $\beta = q$ by a subcritical bifurcation.
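
The saddle-point equation (16) has no closed-form solution for general $q$, but its largest root is easy to obtain numerically. The sketch below is one possible approach, not the procedure used in the paper: Eq. (16) is rewritten as the fixed-point condition $s_0 = (e^{\beta s_0}-1)/(e^{\beta s_0}+q-1)$ and iterated downward from $s_0 = 1$. The function names `largest_root` and `residual` are hypothetical. Note that the routine returns the largest root, which for $q > 2$ may be only metastable; deciding which root is the stable one requires comparing free energies via (15).

```python
import math


def largest_root(beta, q, tol=1e-13, max_iter=200000):
    """Largest solution s0 in [0, 1) of exp(beta*s0) = (1 + (q-1)*s0) / (1 - s0).

    The map s -> (e^{beta s} - 1) / (e^{beta s} + q - 1) is increasing in s for q > 0,
    so iterating from s = 1 decreases monotonically towards the largest fixed point.
    """
    s = 1.0
    for _ in range(max_iter):
        e = math.exp(beta * s)
        s_new = (e - 1.0) / (e + q - 1.0)
        if abs(s_new - s) < tol:
            return s_new
        s = s_new
    return s  # not converged within max_iter (happens very close to a bifurcation)


def residual(beta, q, s):
    """Difference of the two sides of Eq. (16), written in logarithmic form."""
    return beta * s - math.log((1.0 + (q - 1.0) * s) / (1.0 - s))


if __name__ == "__main__":
    for q in (1.0, 2.0, 4.0, 6.0):  # the values of q shown in Fig. 1
        print(q, [round(largest_root(beta, q), 4) for beta in (1.0, 3.0, 5.0, 7.0)])
```

For $q = 2$ the fixed-point map reduces to $s_0 = \tanh(\beta s_0/2)$, the familiar mean-field Ising equation, which gives a convenient check.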



3.2 Diagrammatic expansion of the Potts model 

The relation between the Potts model and the random graph ensemble
introduced in section 2 becomes apparent when considering the
high-temperature expansion of the free energy (14) of the Potts model. Since
the Kronecker delta can take only the values zero or unity, the partition
function can be recast into the form

$$Z(\beta,h,q,\{u_\sigma\},N) = \sum_{\{\sigma_i\}} \prod_{i<j} \big[1 + w\,\delta(\sigma_i,\sigma_j)\big]\; e^{\beta h \sum_\sigma u_\sigma \sum_i \delta(\sigma,\sigma_i)}, \tag{17}$$

where

$$w = \exp\!\left(\frac{\beta}{N}\right) - 1 = \frac{\beta}{N} + O\!\left(\frac{1}{N^2}\right). \tag{18}$$

When expanding the product appearing in (17) we obtain a sum of
$2^{N(N-1)/2}$ terms each of which is in one-to-one correspondence with a
graph. The $N$ vertices of this graph represent the Potts variables $\sigma_i$,
whereas an edge stands for a factor $w\,\delta(\sigma_i,\sigma_j)$. Performing the trace
over the configurations $\{\sigma_i\}$ for each term in the sum, i.e. for each graph,
separately, the Kronecker deltas constrain the Potts variables belonging to
one component of the graph to the same value. As a result we find the
Potts partition function as a sum over graphs in the form

$$Z(\beta,h,q,\{u_\sigma\},N) = \sum_G w^{L(G)} \prod_{n=0}^{C(G)-1} \Big(\sum_\sigma e^{\beta h u_\sigma S_n}\Big), \tag{19}$$

where the product is over all components of the graph and $S_n$ denotes the
size of the $n$-th component. We will assume that $n = 0$ refers to the largest
component.

From (18) and (19) we find

$$Z(\beta,h=0,q,N) = \sum_G \left(\frac{\beta}{N}\right)^{L(G)} q^{C(G)} \tag{20}$$

and comparison with (2) and (6) yields to leading order in $N$

$$Z(\gamma,h=0,q,N) = e^{N\gamma/2}\, Z(\gamma,q,N). \tag{21}$$

Correspondingly from (7) and (14) it follows that

$$f(\gamma,q) = -\frac{1}{2} - \frac{1}{\gamma}\,\varphi(\gamma,q). \tag{22}$$

Eqs. (21) and (22) establish the relation between the random graph ensemble
defined in section 2 and the statistical mechanics of the Potts model
sketched in section 3.1. In particular we obtain from (22) and (15)

$$\varphi(\gamma,q) = \operatorname{extr}_{s_0}\left[ \frac{\gamma}{2}\,\frac{q-1}{q}\,(s_0^2-1) + \ln q - \frac{1+(q-1)s_0}{q}\ln\big(1+(q-1)s_0\big) - \frac{q-1}{q}(1-s_0)\ln(1-s_0) \right], \tag{23}$$

from which $\omega(c,\gamma)$ follows with the help of the Legendre transform (9),
(10). The equation for the saddle-point value $s_0(\gamma,q)$ in (23) is, from (16),

$$e^{\gamma s_0} = \frac{1+(q-1)s_0}{1-s_0}. \tag{24}$$

Differentiating (19) for $u_\sigma = \delta(\sigma,0)$ with respect to $h$, sending first
$N \to \infty$ and then $h \to 0$ from above, one can show that the stable solution
$s_0(\gamma,q)$ of (24) is nothing but the average fraction of vertices in the
largest component, $s_0 = S_0/N$, in an ensemble of random graphs with biased
probability (5). Hence the phase transition in the Potts model describing the
appearance of a spontaneous magnetization at sufficiently low temperature
$1/\beta$ corresponds to a percolation transition in the random graph ensemble
giving birth to a giant component with extensively many vertices at
sufficiently large connectivity parameter $\gamma$.

We also note for later convenience that using (24) in (23) the expression
for $\varphi(\gamma,q)$ can be rewritten as

$$\varphi(\gamma,q) = -\frac{\gamma}{2}\,\frac{q-1}{q}\,(1+s_0^2) - \frac{\gamma s_0}{q} + \ln\big(q - 1 + e^{\gamma s_0}\big). \tag{25}$$
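
The equivalence of (23) and (25) on solutions of (24) can be confirmed numerically. The following sketch is illustrative only; `largest_root`, `phi_23` and `phi_25` are names introduced here. It evaluates both expressions at the largest root of (24) and checks that they coincide, and that both vanish at $q = 1$, where the biased ensemble reduces to the original one. The largest root is used for convenience and is not necessarily the stable solution.

```python
import math


def largest_root(gamma, q, tol=1e-13, max_iter=200000):
    """Largest root of Eq. (24), by fixed-point iteration started at s0 = 1."""
    s = 1.0
    for _ in range(max_iter):
        e = math.exp(gamma * s)
        s_new = (e - 1.0) / (e + q - 1.0)
        if abs(s_new - s) < tol:
            return s_new
        s = s_new
    return s


def phi_23(gamma, q, s):
    """Bracket of Eq. (23) evaluated at a given s0 = s (requires 0 <= s < 1, q > 0)."""
    a = 1.0 + (q - 1.0) * s
    return (gamma * (q - 1.0) / (2.0 * q) * (s * s - 1.0)
            + math.log(q)
            - a / q * math.log(a)
            - (q - 1.0) / q * (1.0 - s) * math.log(1.0 - s))


def phi_25(gamma, q, s):
    """Eq. (25); agrees with phi_23 only if s solves Eq. (24)."""
    return (-gamma * (q - 1.0) / (2.0 * q) * (1.0 + s * s)
            - gamma * s / q
            + math.log(q - 1.0 + math.exp(gamma * s)))


if __name__ == "__main__":
    for gamma, q in [(3.0, 0.5), (3.0, 2.0), (0.5, 0.2)]:
        s = largest_root(gamma, q)
        print(gamma, q, s, phi_23(gamma, q, s), phi_25(gamma, q, s))
```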

It is finally useful to write (19) with $\beta$ replaced by $\gamma$ in the form

$$Z(\gamma,h,q,\{u_\sigma\},N) = Z(\gamma,h=0,q,N)\, \left\langle \exp\!\Big( \sum_{n=0}^{C(G)-1} \ln\Big(\sum_\sigma e^{\gamma h u_\sigma S_n}\Big) - C(G)\ln q \Big) \right\rangle, \tag{26}$$

where the average $\langle \ldots \rangle$ is with respect to the biased probability (5).
Singling out a possible giant component of size $S_0 = N s_0$ and grouping
together all small components of the same size we then obtain for the free
energy (14)

$$f(\gamma,h,q,\{u_\sigma\}) = f(\gamma,h=0,q) - \lim_{N\to\infty}\frac{1}{\gamma N} \ln \left\langle \exp\!\Big\{ N\Big[\gamma h\, s_0(G) + \sum_S \psi(S,G) \ln \sum_\sigma e^{\gamma h u_\sigma S} - c\ln q\Big]\Big\} \right\rangle. \tag{27}$$

Here we have introduced the number of components of size $S$ of graph $G$
divided by $N$,

$$\psi(S,G) = \frac{1}{N}\sum_{n=1}^{C(G)-1} \delta\big(S, S_n(G)\big). \tag{28}$$

Eq. (27) forms a suitable starting point for the characterization of the
distribution of small components from the Potts free energy.

3.3 Properties of atypical graphs 

The connection between the Potts free energy and the component
generating function of Erdős-Rényi graphs allows us to elucidate several large
deviation properties of the random graph ensemble. First we get for the
average number of edges per vertex from (11) and (23)

$$\ell(\gamma,q) = \frac{\gamma}{2q}\,\big(1 + (q-1)\,s_0^2(\gamma,q)\big). \tag{29}$$

The dependence of $\ell$ on the relative size of the giant component $s_0(\gamma,q)$ for
$q \neq 1$ indicates a non-trivial internal organization of edges in rare graphs.

For the number of components of graphs dominating the distribution
(5) we find from (9) and (23)

$$c(\gamma,q) = \big(1 - s_0(\gamma,q)\big)\Big(1 - \frac{\gamma}{2q}\big(1 - s_0(\gamma,q)\big)\Big). \tag{30}$$

The above equations already give access to some microscopic information
on edges and vertices belonging to the giant component or to the small
components. Call $\ell_{in}$ and $\ell_{out}$ the average numbers of edges inside and
outside the giant component divided by $N$ respectively. Obviously,
$\ell_{in} + \ell_{out} = \ell$. In addition, since almost all small components are trees
(as discussed below), the number of these components is related to the number
of edges they contain through $c = 1 - s_0 - \ell_{out}$. From these two relations, we
obtain

$$\ell_{in}(\gamma,q) = \frac{\gamma}{2q}\,\big(2 s_0 + (q-2)\,s_0^2\big), \qquad \ell_{out}(\gamma,q) = \frac{\gamma}{2q}\,(1-s_0)^2, \tag{31}$$

from which we deduce the average degrees

$$d_{in}(\gamma,q) = \frac{\gamma}{q}\,\big(2 + (q-2)\,s_0\big), \qquad d_{out}(\gamma,q) = \frac{\gamma}{q}\,(1-s_0) \tag{32}$$

of vertices inside and outside the giant component respectively. The
dependence of these degrees on $q$ for one particular value of $\gamma$ is shown in Fig. 2
together with results from numerical simulations described in section 5.

Figure 2: Average degrees $d_{in}(\gamma,q)$ (dashed, top) and $d_{out}(\gamma,q)$ (full) as
functions of $q$ according to (32) for $\gamma = 0.25$ (left) and $\gamma = 8$ (right).
The symbols indicate numerical results (diamond = inside, circle = outside
the giant component). The statistical error bars are much smaller than the
symbol size.
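
The bookkeeping behind (29)-(32) can be verified mechanically. In the sketch below the function name `edge_and_degree_observables` is hypothetical and $s_0$ is passed in as a plain number; in an actual calculation it would be the stable solution of (24). The relations $\ell_{in} + \ell_{out} = \ell$, $c = 1 - s_0 - \ell_{out}$, $d_{in} = 2\ell_{in}/s_0$ and $d_{out} = 2\ell_{out}/(1-s_0)$ are algebraic identities among the formulas, so they hold for any value of $s_0$.

```python
def edge_and_degree_observables(gamma, q, s0):
    """Evaluate Eqs. (29)-(32) for a given giant-component fraction s0."""
    pref = gamma / (2.0 * q)
    ell = pref * (1.0 + (q - 1.0) * s0 ** 2)            # Eq. (29): edges per vertex
    c = (1.0 - s0) * (1.0 - pref * (1.0 - s0))          # Eq. (30): components per vertex
    ell_in = pref * (2.0 * s0 + (q - 2.0) * s0 ** 2)    # Eq. (31): edges inside the giant component
    ell_out = pref * (1.0 - s0) ** 2                    # Eq. (31): edges outside
    d_in = gamma / q * (2.0 + (q - 2.0) * s0)           # Eq. (32): mean degree inside
    d_out = gamma / q * (1.0 - s0)                      # Eq. (32): mean degree outside
    return {"ell": ell, "c": c, "ell_in": ell_in, "ell_out": ell_out,
            "d_in": d_in, "d_out": d_out}


if __name__ == "__main__":
    # s0 = 0.4 is an arbitrary illustrative value, not a solution of Eq. (24).
    print(edge_and_degree_observables(gamma=3.0, q=0.5, s0=0.4))
```

Each edge contributes to the degree of two vertices, which is the origin of the factor 2 relating $d_{in}$, $d_{out}$ to $\ell_{in}$, $\ell_{out}$.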

In order to calculate the complete spectrum $\omega(c,\gamma)$ using the Legendre
transform (10) we need to know $\varphi(\gamma,q)$ for general real $q > 0$. We have hence
to study the extremization over $s_0$ in (23) for fixed $\gamma$ and variable $q$. This
is somewhat complementary to what is done in the statistical mechanics of
the Potts model where the free energy (15) is minimized for integer $q \ge 2$
and different values of $\beta$. Here we have to keep in mind that the extremum
in (23) is a minimum if $q > 1$ but a maximum if $0 < q < 1$.^1

^1 The reason for this is that due to the constraint (120) the free energy depends on
$(q-1)$ variables, a number which becomes negative for $q < 1$.

The dependence of $\varphi(\gamma,q)$ on $q$ is qualitatively different for $\gamma < 2$ and
$\gamma > 2$ as shown in Fig. 3. For $\gamma < 2$ the stable solution of (24) is positive for
$q < \gamma$, goes to zero for $q \to \gamma$, and is identically zero for $q > \gamma$ (cf. left inset
in Fig. 3). Accordingly $\varphi(\gamma,q)$ shows a second order phase transition at
$q = \gamma$ as displayed in the left part of Fig. 3. For $\gamma > 2$ the small-$q$ solution
$s_0(\gamma,q) > 0$ remains stable up to $q = q_M > \gamma$ and coexists over a range of
$q > \gamma$ with the solution $s_0 = 0$, which is locally stable for $q > \gamma$ as before,
see right inset in Fig. 3. Accordingly the phase transition is now first order
and takes place at the Maxwell point $q = q_M$ where the two values of
$\varphi(\gamma,q)$ coincide as shown in the right panel of Fig. 3. At the transition, the
value of $s_0$ jumps discontinuously, and so does the derivative of $\varphi(\gamma,q)$ with
respect to $q$.

Figure 3: Free energy (23) of the random graph ensemble as a function of
$q$ for connectivities $\gamma = 1.8$ (left) and $\gamma = 4$ (right). The full and dashed
curves correspond to the small clusters phase ($s_0 = 0$) and the giant
component phase ($s_0 > 0$) respectively. Both free energies coincide at $q = 1$
(point A). For $\gamma < 2$ (left), a second order phase transition arises when $q$
crosses $q = \gamma$ (point B) with the size of the giant component approaching
zero continuously (left inset). When $\gamma > 2$ (right), the transition takes
place at point C with abscissa $q_M > \gamma$ and is first order. Both the slope of
the free energy and the size of the giant component (right inset) are
discontinuous at the transition. Branches BD and CD correspond to unstable
(local maximum) and metastable (secondary local minimum) solutions
respectively.

Let us now turn to the discussion of $\omega(c,\gamma)$. The bifurcation point $q = \gamma$
of $\varphi(\gamma,q)$ maps under the Legendre transform (9), (10) onto $c = 1/2$ for all
values of $\gamma$. On the other hand the different behaviour of $\varphi(\gamma,q)$ for $\gamma < 2$
and $\gamma > 2$ implies qualitative differences of $\omega(c,\gamma)$ in the two cases as well.

For $\gamma < 2$ we find from the Legendre transform that for all values of
$q$ there is exactly one corresponding value of $c$. Accordingly the transition
from the percolating phase $s_0 > 0$ at $c < 1/2$ to the small component
phase at $c > 1/2$ is smooth as shown by the curves for $\gamma = 0.25, 1, 2$ in
Fig. 4. Except for $q = 1$ the appearance of the giant component takes
place in graphs with exponentially small probabilities, $\omega(c = 1/2,\gamma) < 0$.
For $\gamma < 1$ this happens in the increasing part of $\omega(c,\gamma)$, for $\gamma > 1$ in the
decreasing one, in accordance with the fact that the slope of $\omega(c,\gamma)$ is given
by $-\ln q$, cf. (10), and that $\gamma = q$ at the transition.

Figure 4: Logarithmic probability distribution $\omega(c,\gamma)$ of the number of
components per vertex, $c$, for different values of the connectivity parameter
$\gamma = 0.25, 1, 2, 3$ (left bottom to top). $\omega$ is maximal and zero for the most
probable fraction of components $c^*$ given by (47). For $\gamma < 2$, there is
a second order percolation transition at $c = 1/2$ (points B) marking the
appearance of a giant component for $c < 1/2$. When $\gamma > 2$, a first order
transition separates the giant component phase (left of point $C_-$) from the
phase without giant component (right of $C_+$). In between, both phases
coexist and the convex hull of $\omega$ is linear in $c$ (dashed line).

For $\gamma > 2$ the first order transition in $\varphi(\gamma,q)$ implies via the Legendre
transform that for one particular value of $q$, namely $q = q_M$, there are two
corresponding values, $c_-$ and $c_+$, of $c$. Hence the biased probability
distribution $P(C;\gamma,q_M,N)$ is bimodal and $\omega(c,\gamma)$ is non-convex in the
interval $c_- < c < c_+$. At the same time the Legendre transform yields only
the convex hull of $\omega(c,\gamma)$ and therefore includes a linear part with slope
$-\ln q_M$ interpolating between $\omega(c_-,\gamma)$ and $\omega(c_+,\gamma)$ as shown exemplarily
for $\gamma = 3$ with the dotted line in Fig. 4. A random graph ensemble generated
according to $P(C;\gamma,q_M,N)$ is hence inhomogeneous in the sense that it
contains realizations with $c = c_-$ (and with giant component) and with
$c = c_+$ (and without giant component). The value of $c$ in such an ensemble
depends on the relative fraction of these two realizations and is determined
by pre-exponential factors in $P(C;\gamma,q_M,N)$. The fraction of realizations
without giant component is zero for $c = c_-$, increases linearly with $c$, and
reaches one at $c = c_+$.

The above analytical results for $\omega(c,\gamma)$ including the bimodal
distribution $P(C;\gamma,q,N)$ for $q = q_M$ are in very good agreement with
extensive numerical simulations described in section 5. This is exemplified for
$\gamma = 0.25$ and $\gamma = 3$ in Fig. 5.

Figure 5: Comparison between analytical (full lines) and numerical
(symbols) results for the logarithmic probability $\omega(c,\gamma)$ of Erdős-Rényi graphs
with atypical number of components for $\gamma = 0.25$ (left) and $\gamma = 8$ (right).
The simulations were done for $N = 1000$ and are described in section 5;
the statistical error bars are much smaller than the symbol size. The big
black dots have the same meaning as in Fig. 4.

For $c > \max(1/2, c_+)$, i.e. in the region where $s_0(\gamma,q) = 0$, it is possible
to perform the Legendre transform analytically to find

$$\omega(c,\gamma) = -\frac{\gamma}{2} + (1-c)\Big(1 + \ln\frac{\gamma}{2} - \ln(1-c)\Big). \tag{33}$$

Hence we have $\omega(c = 1,\gamma) = -\gamma/2$ for all values of $\gamma$ which is, of course,
consistent with Fig. 4. This result holds as long as $\gamma$ is finite. Another
interesting large-$q$ limit is obtained if $q$ and $\gamma$ tend to infinity
simultaneously with the ratio $r = \ln q/\gamma$ being kept constant. The tendency
to prefer graphs with many components implied by $q \to \infty$ may then be
counterbalanced by the large connectivity parameter $\gamma$. In fact for $r > 1/2$
we have $q > q_M(\gamma)$ and hence $s_0 = 0$ which brings us back to (33). On the
other hand for $r < 1/2$ we find from (24) to leading order $s_0 = r$ and hence
from (30) $c = 1 - r$. Therefore in this case $\gamma$ is large enough to set up a
giant component while the other vertices are essentially isolated in order
to make the number of components as large as possible.
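
The closed form (33) can be cross-checked against a direct numerical Legendre transform. The sketch below is an illustration under a stated restriction: it uses only the branch $s_0 = 0$ of (23), for which $\varphi(\gamma,q) = -\gamma(q-1)/(2q) + \ln q$, and is therefore meaningful only in the region $c > \max(1/2, c_+)$ named in the text. The function names and the ternary search in $x = \ln q$ are choices made here; the search relies on the objective $\varphi - c\ln q$ being convex in $x$ on this branch.

```python
import math


def phi_small(gamma, q):
    """phi(gamma, q) from Eq. (23) on the branch s0 = 0 (no giant component)."""
    return -gamma * (q - 1.0) / (2.0 * q) + math.log(q)


def omega_numeric(c, gamma, lo=-30.0, hi=30.0, iters=300):
    """Eq. (10) on the s0 = 0 branch: minimise phi - c*ln(q) over x = ln(q) by ternary search."""
    def g(x):
        return phi_small(gamma, math.exp(x)) - c * x

    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if g(m1) < g(m2):
            hi = m2
        else:
            lo = m1
    return g(0.5 * (lo + hi))


def omega_closed(c, gamma):
    """Closed form, Eq. (33); requires 0 < c < 1."""
    return -gamma / 2.0 + (1.0 - c) * (1.0 + math.log(gamma / 2.0) - math.log(1.0 - c))


if __name__ == "__main__":
    for c in (0.6, 0.8, 0.95):
        print(c, omega_numeric(c, 1.0), omega_closed(c, 1.0))
```

The minimising value is $q = \gamma/(2(1-c))$, so that for $\gamma \le 1$ the typical value $c^* = 1 - \gamma/2$ corresponds to $q = 1$ and gives $\omega = 0$.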
The opposite limit $c \to 0$ corresponds to $q \to 0$. The random graph
ensemble is for very small $q$ dominated by graphs with very few components
and for $q \to 0$ only connected graphs (i.e. those with $C = 1$) survive.
From (24) and (25) we find in this limit

$$s_0(\gamma,q) = 1 - \frac{q}{e^{\gamma}-1} + O(q^2), \tag{34}$$

$$\varphi(\gamma,q) = \ln(1 - e^{-\gamma}) + q\,\frac{2(e^{\gamma}-1)-\gamma}{2\,(e^{\gamma}-1)^2} + O(q^2). \tag{35}$$

This results via the Legendre transform (9), (10) in

$$c(\gamma,q) = q\,\frac{2(e^{\gamma}-1)-\gamma}{2\,(e^{\gamma}-1)^2} + O(q^2), \tag{36}$$

consistent with $c \to 0$, and

$$\omega(c,\gamma) = \ln(1 - e^{-\gamma}) + q\,\frac{2(e^{\gamma}-1)-\gamma}{2\,(e^{\gamma}-1)^2}\,(1 - \ln q) + O(q^2 \ln q). \tag{37}$$

We hence find $\omega(c = 0,\gamma) = \ln(1 - e^{-\gamma})$. This again agrees with Fig. 4 and is
moreover in accordance with the known rigorous result that the probability
for an Erdős-Rényi random graph to be connected is asymptotically given
by $(1 - e^{-\gamma})^N$ [1].
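
The small-$q$ expansions can be compared with the exact root of (24). The sketch below is illustrative; the function names are hypothetical and the coefficient is written in the form $(2(e^{\gamma}-1)-\gamma)/(2(e^{\gamma}-1)^2)$, which is one of several equivalent ways to express it. For a small value of $q$ the exact $s_0$ and $c$ from (24) and (30) should differ from (34) and (36) only at order $q^2$.

```python
import math


def s0_exact(gamma, q, tol=1e-14, max_iter=100000):
    """Largest root of Eq. (24), by fixed-point iteration started at s0 = 1."""
    s = 1.0
    for _ in range(max_iter):
        e = math.exp(gamma * s)
        s_new = (e - 1.0) / (e + q - 1.0)
        if abs(s_new - s) < tol:
            return s_new
        s = s_new
    return s


def c_exact(gamma, q):
    """Eq. (30) evaluated at the root of Eq. (24)."""
    s = s0_exact(gamma, q)
    return (1.0 - s) * (1.0 - gamma / (2.0 * q) * (1.0 - s))


def s0_small_q(gamma, q):
    """Leading terms of Eq. (34)."""
    return 1.0 - q / math.expm1(gamma)


def c_small_q(gamma, q):
    """Leading term of Eq. (36)."""
    a = math.expm1(gamma)  # e^gamma - 1
    return q * (2.0 * a - gamma) / (2.0 * a * a)


if __name__ == "__main__":
    gamma = 2.0
    for q in (1e-1, 1e-2, 1e-3):
        print(q, s0_exact(gamma, q) - s0_small_q(gamma, q), c_exact(gamma, q) - c_small_q(gamma, q))
```

The printed differences should shrink roughly by a factor of 100 each time $q$ is reduced by a factor of 10, as expected for an $O(q^2)$ remainder.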

We may finally extract useful information on the size distribution of
small components from the Potts free energy. Let us denote by

$$\psi(S,\gamma,q) = \sum_G P(G;\gamma,q,N)\,\psi(S,G) \tag{38}$$

with $\psi(S,G)$ defined by (28) the average distribution of small components
in a graph ensemble characterized by the biased distribution $P(G;\gamma,q,N)$.
Consistent with the meaning of $\psi(S,G)$ we then find to leading order in $N$

$$\sum_S \psi(S,\gamma,q) = c(\gamma,q), \tag{39}$$

$$\sum_S \psi(S,\gamma,q)\,S = 1 - s_0(\gamma,q). \tag{40}$$

To get in addition an expression for the second moment of $\psi(S,\gamma,q)$ it is
useful to consider the second derivative of $f(\gamma,h,q,\{u_\sigma\})$ with respect to
$h$ at $h = 0$ for field configurations with

$$u_0 = 1 \quad\text{and}\quad u_\sigma < 1 \ \text{ for } \sigma = 1, \ldots, q-1. \tag{41}$$
Figure 6: First and second moment of the distribution of non-extensive
component sizes $\psi(S,\gamma,q)$ as function of $q$ for $\gamma = 0.25$ (left) and $\gamma = 8$
(right). Full lines are the analytical expressions describing the
thermodynamic limit $N \to \infty$, symbols give results of numerical simulations for
$N = 1000$ described in section 5.



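Denoting by $\langle s_0^2\rangle_c = \langle s_0^2\rangle - \langle s_0\rangle^2$ the second cumulant of the relative size
of the giant component and using the abbreviations

$$\bar u = \frac{1}{q}\sum_\sigma u_\sigma \quad\text{and}\quad \overline{u^2} = \frac{1}{q}\sum_\sigma u_\sigma^2 \tag{42}$$

one can show from (27) that

$$-\frac{1}{\gamma}\,\frac{\partial^2 f}{\partial h^2}(\gamma,h=0,q) = \big(\overline{u^2} - \bar u^2\big)\sum_S \psi(S,\gamma,q)\,S^2 + (1-\bar u)^2\, N\langle s_0^2\rangle_c. \tag{43}$$

On the other hand one finds from (122) for the same quantity after some
algebra

$$-\frac{1}{\gamma}\,\frac{\partial^2 f}{\partial h^2}(\gamma,h=0,q) = \big(\overline{u^2} - \bar u^2\big)\,\frac{q\,(1-s_0)}{q - \gamma(1-s_0)} + (1-\bar u)^2\,\frac{q^2\, s_0\,(1-s_0)}{\big(q-\gamma(1-s_0)\big)\big(q-\gamma(1-s_0)(1+(q-1)s_0)\big)}. \tag{44}$$

Since (43) and (44) must be identical for any choice of the fields $u_\sigma$
consistent with (41) we are left with

$$\sum_S \psi(S,\gamma,q)\,S^2 = \frac{q\,(1-s_0)}{q-\gamma(1-s_0)}, \tag{45}$$

$$\langle s_0^2\rangle_c = \frac{1}{N}\,\frac{q^2\, s_0\,(1-s_0)}{\big(q-\gamma(1-s_0)\big)\big(q-\gamma(1-s_0)(1+(q-1)s_0)\big)}. \tag{46}$$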



Fig. 6 shows the first and second moment of $\psi(S,\gamma,q)$ for two values of $\gamma$
as function of $q$ together with results from the numerical simulations. Note
that for $\gamma < 2$ the soft transition at $\gamma = q$ gives rise to a diverging second
moment of $\psi(S,\gamma,q)$ (left panel) whereas for $\gamma > 2$ it remains finite at
the transition (right panel). Accordingly the finite size corrections at the
transition are much larger in the first case.

It appears to be possible to extend the above procedure to obtain also
higher moments of $\psi(S,\gamma,q)$; however, the calculations become increasingly
tedious. We have not been able to derive a closed expression for the
complete distribution $\psi(S,\gamma,q)$ from the Potts free energy except for the case
$q = 1$ which is discussed in the next section. A general expression for
$\psi(S,\gamma,q)$ will, however, be derived in section 4.6 using our microscopic
approach.

3.4 Properties of typical random graphs 

In this subsection we rederive some of the central results for typical graphs
as special cases of our more general framework. As discussed in section 2
the random graph ensemble is for large values of $N$ dominated by graphs
contributing to the maximum of $\omega(c,\gamma)$. Since at this maximum
$\partial\omega/\partial c = 0$ we find from (10) that typical properties of random graphs
can be extracted from the Potts free energy in the vicinity of $q = 1$. This is
well known [15] and is, of course, also clear from the definition (5) implying

$$P(G;\gamma,q=1,N) = P(G;\gamma,N).$$

Explicitly we find for the typical number of components $c^*$ from (30) at
$q = 1$

$$c^*(\gamma) = (1 - s_0^*)\Big(1 - \frac{\gamma}{2}\,(1 - s_0^*)\Big), \tag{47}$$

where the relative size $s_0^*$ of the giant component is the stable solution of
the equation

$$1 - s_0^* = e^{-\gamma s_0^*}, \tag{48}$$

which follows from (24) for $q = 1$. Eqs. (47) and (48) are classical results
of Erdős and Rényi [1]. For small values of $\gamma$ almost all components of the
graph are small trees. Hence $s_0^* = 0$ and each edge reduces the number of
components by one. With the typical number of edges per vertex given by
(cf. (11) and (23))

$$\ell^* = \ell(\gamma,q=1) = \frac{\gamma}{2} \tag{49}$$

this implies $c^* = 1 - \gamma/2$ which coincides with (47) for $s_0^* = 0$. For $\gamma > 1$
there is on average more than one edge attached to each vertex and hence
the connectivity may spread out through the whole system resulting in
the emergence of a giant component. Its size $s_0^*$ is an increasing function
of $\gamma$. At the same time the giant component has a denser connectivity
than a tree, involving loops, which slows down the decrease of the number of
components $c^*$ with $\gamma$ as described by (47). The dependence of $s_0^*$ and $c^*$ on
$\gamma$ is shown in Fig. 7. The reason for the remarkable similarity between the
results (47) and (30) for the number of components in typical and atypical
graphs respectively will become clear in section 4.5.

Figure 7: Properties of typical random graphs. Shown are the fraction $s_0^*$ of
vertices in the largest component (left), and the number $c^*$ of components
per vertex (right) as function of the connectivity parameter $\gamma$. The vertical
dashed line, $\gamma = 1$, indicates the location of the percolation transition.
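
Equations (47) and (48) are simple to evaluate numerically. The sketch below is illustrative; the names `giant_fraction` and `typical_components` are introduced here. It iterates $s \mapsto 1 - e^{-\gamma s}$ from $s = 1$, which converges monotonically to the largest solution of (48): zero for $\gamma \le 1$ and the giant-component fraction for $\gamma > 1$. Very close to $\gamma = 1$ the convergence becomes slow and the iteration cap may be reached.

```python
import math


def giant_fraction(gamma, tol=1e-14, max_iter=1000000):
    """Largest solution of 1 - s = exp(-gamma*s), Eq. (48)."""
    s = 1.0
    for _ in range(max_iter):
        s_new = 1.0 - math.exp(-gamma * s)
        if abs(s_new - s) < tol:
            return s_new
        s = s_new
    return s


def typical_components(gamma):
    """Typical number of components per vertex, Eq. (47)."""
    s = giant_fraction(gamma)
    return (1.0 - s) * (1.0 - 0.5 * gamma * (1.0 - s))


if __name__ == "__main__":
    for gamma in (0.5, 1.5, 2.0, 4.0):
        print(gamma, giant_fraction(gamma), typical_components(gamma))
```

Below the percolation threshold the output reproduces $c^* = 1 - \gamma/2$; above it, $c^*$ decreases more slowly with $\gamma$, as described in the text.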

For the case of typical graphs considered in this subsection it is
possible to obtain some more detailed results. A simple application of Bayes'
theorem for conditional probabilities [19] yields the complete degree
distribution inside and outside the giant component. The probability of a vertex
to have $d$ edges is given by (1). The probability not to belong to the giant
component conditioned on having $d$ edges is clearly
$P(\mathrm{out};d) = (1 - s_0^*)^d$. The complementary probability to belong to the
giant component is hence $P(\mathrm{in};d) = 1 - (1 - s_0^*)^d$. Then from Bayes'
theorem we get for the probability to have degree $d$ conditioned on not being
part and on being part of the giant component respectively

$$P^*_{out}(d) = \frac{P(\mathrm{out};d)\,P(d)}{P(\mathrm{out})} = \frac{(1-s_0^*)^d}{1-s_0^*}\; e^{-\gamma}\,\frac{\gamma^d}{d!} = e^{-\gamma(1-s_0^*)}\,\frac{\big(\gamma(1-s_0^*)\big)^d}{d!}, \tag{50}$$

$$P^*_{in}(d) = \frac{P(\mathrm{in};d)\,P(d)}{P(\mathrm{in})} = \frac{1-(1-s_0^*)^d}{s_0^*}\; e^{-\gamma}\,\frac{\gamma^d}{d!}. \tag{51}$$

The last equality in (50), in which we have used (48), shows that the degree
distribution outside the giant component is still Poissonian. On the other
hand the distribution inside the giant component clearly deviates from a
Poissonian law. Calculating the averages of the distributions (50) and (51)
we find

$$d^*_{in}(\gamma) = \gamma\,(2 - s_0^*), \qquad d^*_{out}(\gamma) = \gamma\,(1 - s_0^*), \tag{52}$$

consistent with (32) for $q = 1$.
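
The averages (52) follow from (50) and (51) by direct summation, which the following sketch carries out numerically. It is an illustration only: the function names and the truncation of the sums at `dmax` are choices made here, and the code assumes $\gamma > 1$ so that $s_0^* > 0$ and the division in (51) is well defined.

```python
import math


def giant_fraction(gamma, tol=1e-14, max_iter=1000000):
    """Largest solution of 1 - s = exp(-gamma*s), Eq. (48)."""
    s = 1.0
    for _ in range(max_iter):
        s_new = 1.0 - math.exp(-gamma * s)
        if abs(s_new - s) < tol:
            return s_new
        s = s_new
    return s


def poisson(d, mean):
    """Poisson probability e^{-mean} mean^d / d!, computed via logarithms."""
    return math.exp(-mean + d * math.log(mean) - math.lgamma(d + 1))


def p_out(d, gamma, s):
    """Eq. (50): degree distribution outside the giant component."""
    return poisson(d, gamma * (1.0 - s))


def p_in(d, gamma, s):
    """Eq. (51): degree distribution inside the giant component (needs s > 0)."""
    return (1.0 - (1.0 - s) ** d) / s * poisson(d, gamma)


def mean_degree(dist, gamma, s, dmax=200):
    """Average of d over the given distribution, truncated at dmax."""
    return sum(d * dist(d, gamma, s) for d in range(dmax))


if __name__ == "__main__":
    gamma = 2.0
    s = giant_fraction(gamma)
    print("d_in :", mean_degree(p_in, gamma, s), "vs", gamma * (2.0 - s))
    print("d_out:", mean_degree(p_out, gamma, s), "vs", gamma * (1.0 - s))
```

Note that (51) assigns zero weight to $d = 0$, as it must: an isolated vertex cannot belong to the giant component.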

Also the complete distribution of component sizes can be determined
from the Potts free energy. In fact for the special choice
$u_\sigma = \delta(\sigma,0)$ we find from (?7)

$$\frac{\partial^2 f}{\partial q\,\partial h}(\gamma,h,q=1)
  = \sum_S \psi^*(S,\gamma)\,S\,e^{-\gamma h S}, \tag{53}$$

with $\psi^*(S,\gamma) = \psi(S,\gamma,q=1)$. On the other hand from (122) we
get for the same configuration of the fields $u_\sigma$ the result

$$\frac{\partial^2 f}{\partial q\,\partial h}(\gamma,h,q=1)
  = 1 - s_0(\gamma,h), \tag{54}$$

where $s_0(\gamma,h)$ is the solution of

$$1 - s_0 = e^{-\gamma(s_0+h)}. \tag{55}$$

Comparing (53) and (54) we hence find

$$\sum_{S=1}^{\infty} \psi^*(S,\gamma)\,S\,e^{-\gamma h S}
  = 1 - s_0(\gamma,h). \tag{56}$$

From (55) and (56) it is straightforward to produce equations for all the
moments of $\psi^*(S,\gamma)$ through successive differentiation at $h = 0$.
A more direct way to obtain $\psi^*(S,\gamma)$ is to get from (55) the
explicit dependence of $s_0(\gamma,h)$ on $\gamma$ and $h$ with the help of
the Lagrange inversion theorem [21],

$$1 - s_0(\gamma,h) = \frac{1}{\gamma}\sum_{S=1}^{\infty}
  \frac{S^{S-1}}{S!}\left(\gamma e^{-\gamma}\right)^S e^{-\gamma h S}. \tag{57}$$

From (56) and (57) we then infer

$$\sum_{S=1}^{\infty} \psi^*(S,\gamma)\,S\,e^{-\gamma h S}
  = \frac{1}{\gamma}\sum_{S=1}^{\infty}
  \frac{S^{S-1}}{S!}\left(\gamma e^{-\gamma}\right)^S e^{-\gamma h S}. \tag{58}$$

Matching powers of $e^{-\gamma h}$ we finally obtain

$$\psi^*(S,\gamma) = \frac{1}{\gamma}\,\frac{S^{S-2}}{S!}
  \left(\gamma e^{-\gamma}\right)^S, \tag{59}$$

another classical result of Erdos and Renyi. For the second moment of
this distribution we easily find

$$\sum_{S=1}^{\infty} \psi^*(S,\gamma)\,S^2
  = \frac{1-s_0}{1-\gamma(1-s_0)}, \tag{60}$$

which reproduces (45) for $q = 1$. We also note that from the complementary
equation we get for $q = 1$

$$\langle s_0^2\rangle - \langle s_0\rangle^2
  = \frac{1}{N}\,\frac{s_0(1-s_0)}{\left(1-\gamma(1-s_0)\right)^2}, \tag{61}$$

a result consistent with rigorous findings about the fluctuations of the
relative size of the giant component of typical Erdos-Renyi graphs [22].
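As a sanity check of the size distribution (59) and its first two moments, one may sum the series numerically. This is a minimal sketch under the assumption that $s_0$ at $q=1$, $h=0$ is the largest root of $1-s_0 = e^{-\gamma s_0}$; the helper names are illustrative and not taken from the paper.

```python
import math

def psi_star(S, gamma):
    """Eq. (59), evaluated via logarithms to avoid overflow for large S."""
    log_psi = ((S - 2) * math.log(S) - math.lgamma(S + 1)
               + S * (math.log(gamma) - gamma) - math.log(gamma))
    return math.exp(log_psi)

def giant_fraction(gamma, iters=500):
    """Largest root of 1 - s = exp(-gamma*s) by fixed-point iteration."""
    s = 1.0
    for _ in range(iters):
        s = 1.0 - math.exp(-gamma * s)
    return s

def moments(gamma, s_max=400):
    """First and second moment of psi_star, truncated at s_max."""
    first = sum(S * psi_star(S, gamma) for S in range(1, s_max))
    second = sum(S * S * psi_star(S, gamma) for S in range(1, s_max))
    return first, second

gamma = 2.0
s = giant_fraction(gamma)
first, second = moments(gamma)
print(f"sum S psi* = {first:.6f}   (1 - s0 = {1 - s:.6f})")
print(f"sum S^2 psi* = {second:.6f}   "
      f"((1-s0)/(1-gamma(1-s0)) = {(1 - s) / (1 - gamma * (1 - s)):.6f})")
```

The truncation at `s_max` is harmless away from $\gamma = 1$, where the terms decay exponentially in $S$.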

4 Evolution of atypical graphs 

In the present section we will look at rare graphs from a more microscopic 
point of view focusing on individual vertices and edges. Our aim will be 
to rederive several of the thermodynamic results presented above without 
reference to the Potts model. To this end we will study the evolution of 
rare random graphs under the addition of a new vertex or a new edge. 
This is similar in spirit to the so-called cavity method in the statistical 
mechanics of disordered systems [14]. The main motivation of what follows
is to find an alternative way to quantitatively characterize rare graphs. It 
may be helpful in the analysis of graphs which are atypical with respect 
to other properties than the number of components, as e.g. the size of 
the giant component or the number of loops. In these cases the relation 
to the Potts model is no longer helpful and no thermodynamic approach 
seems to be known. Finally we will also derive some new results including 
the complete degree distributions inside and outside the giant component 
and the size distribution of non-extensive components. These results were 
obtained above for typical graphs only.

4.1 Energetic versus entropic costs



From a microscopic point of view we may identify two qualitatively different 
reasons for the exponentially small probability of a graph G. On the one 
hand the number of edges in the graph may deviate by an extensive amount 
from the typical number. On the other hand the distribution of edges 
among the vertices of the graph may differ from the typical one. We will 
refer to these two different sources for an exponentially small probability 
as energetic and entropic contribution respectively. 

The energetic cost is completely fixed by the probability distribution of
edges. The probability for a random graph to have $L = \ell N$ edges is, to
leading exponential order in $N$, given by

$$P(L;\gamma,N) = \exp\left(N\left[\ell - \frac{\gamma}{2}
  + \ell\,\ln\frac{\gamma}{2\ell}\right]\right). \tag{62}$$

The expression in the brackets is zero for $\ell = \ell^* = \gamma/2$,
reproducing (49). It is negative for $\ell \neq \ell^*$ and hence all other
values of $\ell$ have probabilities exponentially small in $N$.

To leading order in $N$ we find from (62)

$$P(L+1;\gamma,N) = \frac{\gamma}{2\ell}\,P(L;\gamma,N), \tag{63}$$

which gives the change in the energetic contribution to the probability when
one edge is added. For $\ell < \gamma/2$ the probability increases by the
insertion of an edge, for $\ell > \gamma/2$ it decreases, in accordance with
the fact that $\ell = \gamma/2$ is the typical case.

Let us then consider a graph with $N$ vertices, no giant component, and
an atypically large number $C$ of non-extensive components. A possible
realization of such a graph has all components as trees and an atypically
small number, $L = N - C$, of edges. However, this may not be the optimal
way to build the graph. In fact from (63) we see that the probability of
the graph increases by a factor of order 1 if we add another edge. On
the other hand, in order not to decrease at the same time the number of
components we have to put the new edge between two vertices of one of
the already existing components. For a component with $S = O(1)$ vertices
(and hence $S-1$ edges) the chance to put the new edge between two
of its vertices is roughly $(S-1)(S-2)/N^2 = O(N^{-2})$. Multiplying by
the number of components we find that the probability not to reduce this
number by putting the new edge is of order $O(1/N)$. For large $N$ this
decrease in probability cannot be compensated by the $O(1)$ energetic gain.
Hence, also in the case of rare graphs the non-extensive components are
predominantly trees.
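The energetic cost discussed in this subsection can be illustrated with the exact binomial law for the number of edges. The sketch below is an illustration under stated assumptions: edges are taken to be drawn independently with probability $\gamma/N$, and the function names `log_prob_edges` and `rate` are hypothetical helpers, not notation from the paper.

```python
import math

def log_prob_edges(L, gamma, N):
    """Exact log-probability that a graph with edge probability gamma/N has L edges."""
    M = N * (N - 1) // 2                      # number of vertex pairs
    p = gamma / N
    return (math.lgamma(M + 1) - math.lgamma(L + 1) - math.lgamma(M - L + 1)
            + L * math.log(p) + (M - L) * math.log1p(-p))

def rate(ell, gamma):
    """Leading-order exponent per vertex: ell - gamma/2 + ell*ln(gamma/(2*ell))."""
    return ell - gamma / 2 + ell * math.log(gamma / (2 * ell))

N, gamma, ell = 2000, 2.0, 0.6
L = int(ell * N)
ratio = math.exp(log_prob_edges(L + 1, gamma, N) - log_prob_edges(L, gamma, N))
print(f"P(L+1)/P(L) = {ratio:.4f}   gamma/(2 ell) = {gamma / (2 * ell):.4f}")
print(f"ln P(L)/N   = {log_prob_edges(L, gamma, N) / N:.4f}   "
      f"rate = {rate(ell, gamma):.4f}")
```

The small residual differences are finite-size corrections of relative order $1/N$.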
The situation for graphs $G$ without giant component is hence rather
clear. Since all components are trees their number is given by
$C(G) = N - L(G)$. This implies a simple relation between the generating
functions for rare graphs ($q \neq 1$) and typical graphs ($q = 1$),

$$Z_{s_0=0}(\gamma,q,N) = q^N\,Z_{s_0=0}\!\left(\frac{\gamma}{q},1,N\right). \tag{64}$$

Therefore these graphs are characterized by an effective connectivity
parameter $\gamma/q$ and the number of edges per vertex is given by
$\ell = \gamma/(2q)$, consistent with (29) for $s_0 = 0$. The probability of
such a graph is solely determined by the energetic cost, yielding

$$P(G;\gamma,q,N,s_0=0) = \exp\left(N\,\frac{\gamma}{2q}
  \left[1 - q + \ln q\right]\right). \tag{65}$$

Replacing $q$ in this expression by $c$ according to (10) and (23) we find
back the logarithmic probability (33).

The situation changes in the presence of a giant component of size
$S_0 = O(N)$. From the same kind of reasoning as used above it is clear
that the entropic cost for putting an additional edge inside the giant
component is $O(1)$ and may hence well be over-compensated by the energetic
gain in probability ^. The precise balance between energetic and entropic
contributions in this case will be investigated in the next subsection.



4.2 Adding an edge 

To quantify the entropic contribution to the probability let us consider the
probability $P(C;L)$ for a graph with $L$ edges to have $C$ components. Upon
adding one more edge the number of components may change and we have
quite generally

$$P(C;L+1) = \sum_{\Delta C} K(\Delta C)\,P(C+\Delta C;L). \tag{66}$$

The kernel $K(\Delta C)$ is easily determined. If the new edge lies with both
ends inside the giant component of the graph the number of components
does not change at all, otherwise it is reduced by one,

$$K(\Delta C) = s_0^2\,\delta(\Delta C,0) + (1-s_0^2)\,\delta(\Delta C,1). \tag{67}$$

^We expect the probability to have more than one extensive component to be
negligible for the graphs atypical with respect to the number of components
considered here, similarly to what happens for typical random graphs [23].
See also section 4.6.

Combining (63) and (66) with the kernel (67) we hence find for the
probability $P(C,L;\gamma,N) = P(C;L)\,P(L;\gamma,N)$ the evolution equation

$$P(C,L+1;\gamma,N) = \frac{\gamma}{2\ell}\left(s_0^2\,P(C,L;\gamma,N)
  + (1-s_0^2)\,P(C+1,L;\gamma,N)\right). \tag{68}$$

For the biased probability

$$P(L;\gamma,q,N) = \frac{1}{Z(\gamma,q,N)}\sum_C P(C,L;\gamma,N)\,q^C, \tag{69}$$

where $Z(\gamma,q,N)$ is the generating function defined above, this implies

$$P(L+1;\gamma,q,N) = \frac{\gamma}{2\ell q}\left(1+(q-1)s_0^2\right)
  P(L;\gamma,q,N), \tag{70}$$

as follows from (68) by multiplying with $q^C$ and summing over $C$. Summing
(70) over $L$ and using the fact that this sum is dominated by graphs with
$\ell = \ell(\gamma,q)$ and $s_0 = s_0(\gamma,q)$ we find for the average
number of edges per vertex

$$\ell = \frac{\gamma}{2q}\left(1+(q-1)s_0^2\right), \tag{71}$$

reproducing (29).

In a similar way we may rederive the results for the average number
of edges inside and outside the giant component. To this end we decompose
the kernel (67) into contributions corresponding to the cases of the new edge
being connected to the giant component or not,

$$K(\Delta C) = K_{\mathrm{in}}(\Delta C) + K_{\mathrm{out}}(\Delta C). \tag{72}$$

Clearly

$$K_{\mathrm{in}}(\Delta C) = s_0^2\,\delta(\Delta C,0)
  + 2s_0(1-s_0)\,\delta(\Delta C,1), \tag{73}$$

$$K_{\mathrm{out}}(\Delta C) = (1-s_0)^2\,\delta(\Delta C,1). \tag{74}$$

Proceeding as above we get for the biased probability to have a graph
with $L+1$ edges with the last edge added being connected to the giant
component

$$P_{\mathrm{in}}(L+1;\gamma,q,N) = \frac{\gamma s_0}{2\ell q}
  \left(2+(q-2)s_0\right)P(L;\gamma,q,N). \tag{75}$$

Summing over $L$ this gives

$$P_{\mathrm{in}}(\gamma,q,N) = \frac{\gamma s_0}{2\ell q}
  \left(2+(q-2)s_0\right) \tag{76}$$

and hence

$$\ell_{\mathrm{in}} = \ell\,P_{\mathrm{in}}(\gamma,q,N)
  = \frac{\gamma s_0}{2q}\left(2+(q-2)s_0\right), \tag{77}$$

which is identical with the corresponding result of the thermodynamic
approach. Similarly one may rederive the result for $\ell_{\mathrm{out}}$.

The results for the average total number of edges and the average number
of edges inside and outside the giant component respectively of atypical
graphs are hence directly linked with the balance between energetic and
entropic contributions to the probability of these graphs.



4.3 Adding a vertex 

Several interesting results may be obtained by investigating the evolution of
atypical graphs under addition of a new vertex. Compared with the same
procedure for typical graphs some special care is needed in the present
case of atypical graphs. The reason is the following. In order to keep the
statistical properties of the new vertex as simple as possible we will assume
that it is characterized by the simple Poissonian degree distribution.
In this sense we add a typical vertex to an atypical graph. This in turn
implies a change in the "degree of unlikeliness" of the graph which needs
to be monitored.

To make this argument more quantitative consider the following basic
step of adding one vertex. The probability of a graph $G$ with $N$ vertices
and parameter $\gamma$ is given by

$$P(G;\gamma,N) = \exp\left(-\frac{\gamma N}{2}
  + \gamma\left(\frac{1}{2}-\frac{\gamma}{4}+\frac{L(G)}{N}\right)
  + O(1/N)\right)\left(\frac{\gamma}{N}\right)^{L(G)}, \tag{78}$$

and depends on its number of edges, $L(G)$, only. The new vertex is
assumed to have $d$ incident edges with probability
$P(d;\gamma) = e^{-\gamma}\gamma^d/d!$. There are $\binom{N}{d}$ different
ways to connect these $d$ edges with existing vertices. The new graph is one
of these possible "wirings", and has therefore probability

$$P(d;\gamma)\,\frac{1}{\binom{N}{d}}\,P(G;\gamma,N). \tag{79}$$

But the new graph, $G'$, is one particular graph with $N+1$ vertices and
$L(G') = L(G) + d$ edges. Its probability should therefore be

$$P(G';\gamma,N+1) = \exp\left(-\frac{\gamma (N+1)}{2}
  + \gamma\left(\frac{1}{2}-\frac{\gamma}{4}+\frac{L(G')}{N+1}\right)
  + O(1/N)\right)\left(\frac{\gamma}{N+1}\right)^{L(G)+d}. \tag{80}$$

Equality of expressions (79) and (80) imposes that

$$\frac{\gamma}{2} = -\ln\left(\frac{N}{N+1}\right)^{L(G)}
  = \frac{L(G)}{N} + O(1/N). \tag{81}$$

This is fulfilled for typical graphs since for these the mean number of edges
per vertex is indeed $\gamma/2$, cf. (49). However, if we add a typical
vertex to an atypical graph with $\ell \neq \gamma/2$ we have to introduce an
extra multiplicative weight factor

$$\omega(\gamma,q) = \exp\left(\frac{\gamma}{2} - \ell(\gamma,q)\right) \tag{82}$$

in order to make the new graph an unbiased representative of the new
ensemble.

Let us now investigate how the probability $P(C;\gamma,N)$ for a graph to
have $C$ components changes when we add a new vertex.
The number of components will decrease by a stochastic amount $\Delta C$, and
we have similarly to (66)

$$P(C;\gamma,N+1) = \sum_{\Delta C} K(\Delta C)\,P(C+\Delta C;\gamma,N). \tag{83}$$

The new kernel $K$ has now to comprise both the extra weight (82) of
the new vertex and the probability for the change $\Delta C$ when adding the
new vertex. The degree $d$ of the new vertex, which is also the number
of new edges added to the graph, is a stochastic variable with Poissonian
distribution $e^{-\gamma}\gamma^d/d!$. Of these $d$ edges, $d_0$ may be
connected with the giant component whereas the remaining $d-d_0$ ones are
connected with small components which (with probability 1 for
$N \to \infty$) are all different from each other. The number of components
is hence reduced by $d-d_0$, except for the case $d_0 = 0$ where it changes
by $d-1$. We therefore find

$$K(\Delta C) = \sum_{d\ge 0} e^{-\frac{\gamma}{2}-\ell}\,\frac{\gamma^d}{d!}
  \sum_{d_0=0}^{d}\binom{d}{d_0}s_0^{d_0}(1-s_0)^{d-d_0}\,
  \delta\big(\Delta C,\,d-d_0-\delta(d_0,0)\big), \tag{84}$$

where $s_0$ is the relative size of the giant component prior to the insertion
of the new vertex.

In order to obtain results for atypical graphs from (83) we again multiply
by $q^C$ and sum over $C$ to find

$$Z(\gamma,q,N+1) = \Sigma(\gamma,q)\,Z(\gamma,q,N), \tag{85}$$

where

$$\Sigma(\gamma,q) = \sum_{\Delta C}\hat K(\Delta C)\,q^{-\Delta C}, \tag{86}$$

and $\hat K(\Delta C)$ results from $K(\Delta C)$ by replacing $\ell$ and
$s_0$ by $\ell(\gamma,q)$ and $s_0(\gamma,q)$ as given by (71) and (24)
respectively. Using (84), performing the sum over $\Delta C$, $d_0$ and
finally over $d$ we are left with

$$\Sigma(\gamma,q) = \left(q-1+e^{\gamma s_0}\right)
  \exp\left(-\frac{\gamma}{2}-\ell+\frac{\gamma}{q}(1-s_0)\right). \tag{87}$$

Iterating (85) we find

$$\lim_{N\to\infty}\frac{1}{N}\ln Z(\gamma,q,N) = \ln\Sigma(\gamma,q). \tag{88}$$

Comparison with the thermodynamic approach and insertion of (71) shows that
(87) reproduces the result for the free energy $\varphi(\gamma,q)$ in the
form (25). To complete the rederivation of results of the thermodynamic
approach we still have to produce the self-consistent equation (24) for
$s_0(\gamma,q)$.
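The summation leading from the kernel (84) to the closed form (87) can be verified by brute force. The sketch below treats $s_0$ and $\ell$ as free inputs, since the identity does not rely on their self-consistent values; the function names are illustrative assumptions.

```python
import math

def sigma_brute(gamma, q, s0, ell, dmax=60):
    """Sum K(dC) * q**(-dC) over d and d0 directly from the kernel (84)."""
    omega = math.exp(gamma / 2 - ell)                    # extra weight (82)
    total = 0.0
    for d in range(dmax + 1):
        p_d = math.exp(-gamma) * gamma**d / math.factorial(d)
        for d0 in range(d + 1):
            # change in the number of components for this wiring
            dC = d - d0 - (1 if d0 == 0 else 0)
            weight = math.comb(d, d0) * s0**d0 * (1 - s0)**(d - d0)
            total += omega * p_d * weight * q**(-dC)
    return total

def sigma_closed(gamma, q, s0, ell):
    """Closed form (87)."""
    return ((q - 1 + math.exp(gamma * s0))
            * math.exp(-gamma / 2 - ell + gamma * (1 - s0) / q))

for q in (0.5, 1.0, 2.0):
    brute = sigma_brute(3.0, q, 0.5, 0.7)
    closed = sigma_closed(3.0, q, 0.5, 0.7)
    print(f"q={q}: brute force {brute:.10f}, closed form {closed:.10f}")
```

Note that an isolated new vertex ($d = 0$) gives $\Delta C = -1$, i.e. one additional component, which is why the $d_0 = 0$ term carries an extra factor $q$.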



4.4 The giant component 

More detailed results can be obtained by again decomposing the kernel
$K(\Delta C)$ in (83) into different contributions. E.g. in order to
reproduce the self-consistent equation for $s_0$ we decompose $K(\Delta C)$
into parts corresponding to the possible values of $d_0$:

$$K(\Delta C) = \sum_{d_0\ge 0} K_{d_0}(\Delta C), \tag{89}$$

where

$$K_{d_0}(\Delta C) = e^{\frac{\gamma}{2}-\ell}\sum_{d=d_0}^{\infty}
  e^{-\gamma}\frac{\gamma^d}{d!}\binom{d}{d_0}s_0^{d_0}(1-s_0)^{d-d_0}\,
  \delta\big(\Delta C,\,d-d_0-\delta(d_0,0)\big). \tag{90}$$

For the probability of a graph with $N+1$ vertices to have $C$ components
and the last vertex added making $d_0$ connections with the giant component
we then get

$$P(C,d_0;\gamma,N+1) = \sum_{\Delta C} K_{d_0}(\Delta C)\,
  P(C+\Delta C;\gamma,N). \tag{91}$$

Multiplying with $q^C/Z(\gamma,q,N+1)$, summing over $C$, and specifying to
the case $d_0 = 0$ we get for the biased probability that the new vertex does
not belong to the giant component

$$P(d_0=0;\gamma,q,N+1) = \frac{1}{Z(\gamma,q,N+1)}\sum_{\Delta C}
  K_{d_0=0}(\Delta C)\,q^{-\Delta C}\,Z(\gamma,q,N)
  = \frac{\Sigma_{d_0=0}(\gamma,q)}{\Sigma(\gamma,q)}, \tag{92}$$

where we have used (85), and introduced

$$\Sigma_{d_0=0}(\gamma,q) = \sum_{\Delta C} K_{d_0=0}(\Delta C)\,
  q^{-\Delta C}. \tag{93}$$

Performing the sum over $\Delta C$ and $d$ we find

$$\Sigma_{d_0=0}(\gamma,q) = q\,\exp\left(-\frac{\gamma}{2}-\ell
  +\frac{\gamma}{q}(1-s_0)\right). \tag{94}$$

On the other hand for large $N$ the probability (92) has to be identified with
$1-s_0(\gamma,q)$. Using (87) and (94) this yields finally

$$1 - s_0 = \frac{q}{q-1+e^{\gamma s_0}}, \tag{95}$$

which coincides with (24).
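The self-consistent equation (95) is easily solved numerically. The following sketch uses plain bisection and is only meant for the regime without coexisting solutions: it assumes $q \le 2$, where a nonzero root exists for $\gamma > q$, and the function names are hypothetical.

```python
import math

def giant_size(gamma, q, steps=200):
    """Bisection for the nonzero root of 1 - s0 = q/(q - 1 + exp(gamma*s0)).

    Assumption: q <= 2, so that at most one root lies in (0, 1).
    Returns 0.0 when no sign change is found (no giant component).
    """
    def g(s):
        return 1 - s - q / (q - 1 + math.exp(gamma * s))
    lo, hi = 1e-6, 1 - 1e-12
    if g(lo) <= 0:
        return 0.0
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def edges_per_vertex(gamma, q, s0):
    """Eq. (71): average number of edges per vertex."""
    return gamma / (2 * q) * (1 + (q - 1) * s0**2)

for gamma, q in [(3.0, 2.0), (3.0, 1.0), (3.0, 0.5), (0.5, 1.0)]:
    s0 = giant_size(gamma, q)
    print(f"gamma={gamma} q={q}: s0={s0:.4f} ell={edges_per_vertex(gamma, q, s0):.4f}")
```

For $q > 2$ the transition is of first order and several roots may coexist, so a simple bisection of this kind would need to be replaced by a comparison of the corresponding free energies.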

4.5 The degree distribution 

It is finally possible to derive expressions for the complete degree
distribution in rare graphs by decomposing the kernel $K(\Delta C)$ in (83)
into different parts according to the value of $d$ (rather than $d_0$ as done
above),

$$K(\Delta C) = \sum_{d\ge 0} K_d(\Delta C), \tag{96}$$

where now

$$K_d(\Delta C) = e^{\frac{\gamma}{2}-\ell}\,e^{-\gamma}\frac{\gamma^d}{d!}
  \sum_{d_0=0}^{d}\binom{d}{d_0}s_0^{d_0}(1-s_0)^{d-d_0}\,
  \delta\big(\Delta C,\,d-d_0-\delta(d_0,0)\big). \tag{97}$$

For the probability of a graph with $N+1$ vertices to have $C$ components
and the last vertex added having $d$ edges this implies

$$P(C,d;\gamma,N+1) = \sum_{\Delta C} K_d(\Delta C)\,
  P(C+\Delta C;\gamma,N). \tag{98}$$

Multiplying by $q^C/Z(\gamma,q,N+1)$ and summing over $C$ we find for the
biased probability that the added vertex has degree $d$,

$$P(d;\gamma,q,N+1) = \frac{\Sigma_d(\gamma,q)}{\Sigma(\gamma,q)}, \tag{99}$$

where we have again used (85), and defined

$$\Sigma_d(\gamma,q) = \sum_{\Delta C} K_d(\Delta C)\,q^{-\Delta C}. \tag{100}$$

Calculating explicitly the above sum, and using (87) and (95), we obtain
the degree distribution,

$$P(d;\gamma,q) = \frac{1-s_0}{q}\,e^{-\frac{\gamma}{q}(1-s_0)}\,
  \frac{1}{d!}\left(\frac{\gamma}{q}\right)^d
  \left[\big(1+(q-1)s_0\big)^d + (q-1)(1-s_0)^d\right]. \tag{101}$$

This distribution reduces to the Poissonian law for $q = 1$ only. For
$q \neq 1$ we find deviations from a Poissonian degree distribution, where
for large values of $\gamma$ and $q$ even a bimodal distribution may
occur. For the average degree we obtain from (101)

$$d(\gamma,q) = \frac{\gamma}{q}\left(1+(q-1)s_0^2(\gamma,q)\right), \tag{102}$$

where we have made use of the self-consistent equation (95). Since each
edge is connected with two vertices this result is consistent with (71).
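A short numerical check of the normalization and of the average (102) may be useful here. The sketch evaluates the distribution in the form $P(d) = \frac{1-s_0}{q} e^{-\frac{\gamma}{q}(1-s_0)} \frac{(\gamma/q)^d}{d!}[(1+(q-1)s_0)^d + (q-1)(1-s_0)^d]$, which is what the sums over $\Delta C$ and $d_0$ give; the bisection solver assumes $\gamma > q$ and $q \le 2$, and all function names are illustrative.

```python
import math

def giant_size(gamma, q, steps=200):
    """Bisection for the nonzero root of (95); assumes gamma > q and q <= 2."""
    def g(s):
        return 1 - s - q / (q - 1 + math.exp(gamma * s))
    lo, hi = 1e-6, 1 - 1e-12
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def degree_dist(d, gamma, q, s0):
    """Biased degree distribution of a randomly chosen vertex."""
    a = gamma / q
    pref = (1 - s0) / q * math.exp(-a * (1 - s0)) * a**d / math.factorial(d)
    return pref * ((1 + (q - 1) * s0)**d + (q - 1) * (1 - s0)**d)

gamma, q = 3.0, 2.0
s0 = giant_size(gamma, q)
probs = [degree_dist(d, gamma, q, s0) for d in range(100)]
norm = sum(probs)
mean = sum(d * p for d, p in enumerate(probs))
print(f"s0={s0:.4f}  norm={norm:.6f}  mean degree={mean:.4f}  "
      f"gamma/q*(1+(q-1)*s0^2)={gamma / q * (1 + (q - 1) * s0**2):.4f}")
```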

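While (101) gives the distribution of degrees for a randomly chosen
vertex in the graph, we may ask for more detailed information depending
on whether the vertex belongs, or does not belong to the giant component.
Let us call $P_{\mathrm{in}}(d;\gamma,q)$ and $P_{\mathrm{out}}(d;\gamma,q)$
the biased distributions of degrees for a vertex, respectively, inside and
outside the giant component. The generalization of the above calculation is
straightforward. $P_{\mathrm{out}}(d;\gamma,q)$ and
$P_{\mathrm{in}}(d;\gamma,q)$ are obtained from specializing the kernel $K$
to $d$, $d_0 = 0$ and $d$, $1 \le d_0 \le d$ respectively. The calculations
are very similar to the one presented above, the results read

$$P_{\mathrm{out}}(d;\gamma,q) = e^{-\frac{\gamma}{q}(1-s_0)}\,\frac{1}{d!}
  \left(\frac{\gamma}{q}(1-s_0)\right)^d, \tag{103}$$

$$P_{\mathrm{in}}(d;\gamma,q) = \frac{1-s_0}{q s_0}\,
  e^{-\frac{\gamma}{q}(1-s_0)}\,\frac{1}{d!}\left(\frac{\gamma}{q}\right)^d
  \left[\big(1+(q-1)s_0\big)^d - (1-s_0)^d\right]. \tag{104}$$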



These equations give a rather detailed description of the connectivity in
atypical graphs. For $q = 1$ they reproduce (50) and (51) respectively. The
corresponding average degrees $d_{\mathrm{out}}$ and $d_{\mathrm{in}}$ are in
agreement with (32). The remarkable fact that
$P_{\mathrm{out}}(d;\gamma,q) = P^*_{\mathrm{out}}(d;\gamma/q)$ generalizes
the mapping (64) to the case $s_0 \neq 0$ and explains the similarity between
the expressions (47) and (30) for the number of components in typical and
atypical graphs respectively. For large $\gamma$ and $q$ the distribution
$P_{\mathrm{out}}(d;\gamma,q)$ is peaked at small values of $d$ whereas
$P_{\mathrm{in}}(d;\gamma,q)$ is maximal for larger $d$, which gives rise to
the possible bimodal form of the total distribution $P(d;\gamma,q)$. We
also note

$$P_{\mathrm{out}}(d=1;\gamma,q) = P_{\mathrm{in}}(d=1;\gamma,q)
  = P(d=1;\gamma,q) \tag{105}$$

for all values of $\gamma$ and $q$, showing the special role of leaves in the
graphs. For all other values of $d$ the distributions
$P_{\mathrm{out}}(d;\gamma,q)$ and $P_{\mathrm{in}}(d;\gamma,q)$ differ
from each other. Of course $P_{\mathrm{in}}(d=0;\gamma,q) = 0$ since no
isolated vertex may belong to the giant component.

Figure 8: Degree distributions $P_{\mathrm{in}}(d;\gamma,q)$ (dashed) and
$P_{\mathrm{out}}(d;\gamma,q)$ (full) as given by (104) and (103) as
functions of $q$ for $\gamma = 3$, $d = 2$ (left) and $d = 3$ (right). The
numerical results are shown by symbols (diamond = inside, circle = outside
the giant component). The dashed vertical line indicates the critical value
$q_c$, where the giant component ceases to exist. The statistical error bars
are much smaller than the symbol size.
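The properties of the degree distributions inside and outside the giant component noted above (normalization, the coincidence for leaves, the vanishing of the inside distribution at $d = 0$) can be confirmed with a few lines of code. This is a sketch assuming the forms $P_{\rm out}(d) = e^{-\lambda}\lambda^d/d!$ with $\lambda = \gamma(1-s_0)/q$ and $P_{\rm in}(d) = \frac{1-s_0}{q s_0} e^{-\lambda} \frac{(\gamma/q)^d}{d!}[(1+(q-1)s_0)^d - (1-s_0)^d]$; the solver assumes $\gamma > q$, $q \le 2$, and the names are illustrative.

```python
import math

def giant_size(gamma, q, steps=200):
    """Bisection for the nonzero root of (95); assumes gamma > q and q <= 2."""
    def g(s):
        return 1 - s - q / (q - 1 + math.exp(gamma * s))
    lo, hi = 1e-6, 1 - 1e-12
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def p_out(d, gamma, q, s0):
    lam = gamma * (1 - s0) / q
    return math.exp(-lam) * lam**d / math.factorial(d)

def p_in(d, gamma, q, s0):
    lam = gamma * (1 - s0) / q
    pref = (1 - s0) / (q * s0) * math.exp(-lam) * (gamma / q)**d / math.factorial(d)
    return pref * ((1 + (q - 1) * s0)**d - (1 - s0)**d)

gamma, q = 3.0, 2.0
s0 = giant_size(gamma, q)
degrees = range(100)
norm_in = sum(p_in(d, gamma, q, s0) for d in degrees)
# average degree of a random vertex as mixture of inside and outside vertices
mean_total = sum(d * (s0 * p_in(d, gamma, q, s0) + (1 - s0) * p_out(d, gamma, q, s0))
                 for d in degrees)
print(f"norm_in={norm_in:.6f}  leaves: in={p_in(1, gamma, q, s0):.6f} "
      f"out={p_out(1, gamma, q, s0):.6f}  mean={mean_total:.4f}")
```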

Fig. 8 shows the degree distributions inside and outside the giant
component for $d = 2$ and $d = 3$ at $\gamma = 3$ as function of $q$ together
with results from numerical simulations. With increasing $q$ the biased
distribution $P(G;\gamma,q,N)$ gets more and more dominated by graphs with
many components. From Fig. 8 we infer that in this process both
$P_{\mathrm{out}}(d=2;\gamma,q)$ and $P_{\mathrm{in}}(d=2;\gamma,q)$
increase. Nevertheless $P(d=2;\gamma,q)$ and therefore the total
number of vertices carrying two edges decreases due to the shrinking of the
giant component (cf. left inset in Fig. ).

It is interesting to note that

$$P'(d) = \frac{q s_0}{1+(q-1)s_0}\,P_{\mathrm{in}}(d;\gamma,q)
  + \frac{1-s_0}{1+(q-1)s_0}\,P_{\mathrm{out}}(d;\gamma,q) \tag{106}$$

$$\phantom{P'(d)} = e^{-\frac{\gamma}{q}(1+(q-1)s_0)}\,\frac{1}{d!}
  \left(\frac{\gamma}{q}\big(1+(q-1)s_0\big)\right)^d, \tag{107}$$

i.e., $P'(d)$ is Poissonian with parameter

$$\gamma' = \frac{\gamma}{q}\big(1+(q-1)s_0\big). \tag{108}$$

This allows the following interpretation of (103), (104). Let us postulate
that rare graphs dominating the distribution $P(G;\gamma,q,N)$ consist of
independent vertices with effective degree distribution $P'(d)$ as given by
(107) and that the probability for a vertex to belong to the giant component
is given by $P'(\mathrm{in}) = q s_0/(1+(q-1)s_0)$. Accordingly the
probability not to belong to the giant component is
$P'(\mathrm{out}) = 1 - P'(\mathrm{in}) = (1-s_0)/(1+(q-1)s_0)$.
Repeating then the simple Bayes argument leading to (50) and (51) we can
reproduce the correct results for the degree distribution inside and outside
the giant component as given by (103) and (104). In this interpretation the
shift from the Poissonian law $P(d)$ to $P'(d)$ accounts for the energetic
contribution to the probability of a rare graph whereas the replacement of
$P(\mathrm{in}) = s_0$ by $P'(\mathrm{in})$ stands for the entropic one.
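The effective-vertex interpretation can be made concrete as follows. In this sketch the Bayes argument is repeated with the Poissonian law of parameter $\gamma' = \frac{\gamma}{q}(1+(q-1)s_0)$ and with the probability $(1-P'(\mathrm{in}))^d$ for a vertex of degree $d$ to stay outside the giant component; the result is compared with the closed forms for the outside and inside degree distributions. The solver assumes $\gamma > q$ and $q \le 2$; all names are illustrative.

```python
import math

def giant_size(gamma, q, steps=200):
    """Bisection for the nonzero root of (95); assumes gamma > q and q <= 2."""
    def g(s):
        return 1 - s - q / (q - 1 + math.exp(gamma * s))
    lo, hi = 1e-6, 1 - 1e-12
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

gamma, q = 3.0, 2.0
s0 = giant_size(gamma, q)
gamma_eff = gamma / q * (1 + (q - 1) * s0)      # effective Poisson parameter
p_in_eff = q * s0 / (1 + (q - 1) * s0)          # P'(in)

def poisson(d, lam):
    return math.exp(-lam) * lam**d / math.factorial(d)

def bayes_out(d):
    return (1 - p_in_eff)**d * poisson(d, gamma_eff) / (1 - p_in_eff)

def bayes_in(d):
    return (1 - (1 - p_in_eff)**d) * poisson(d, gamma_eff) / p_in_eff

def closed_out(d):
    return poisson(d, gamma * (1 - s0) / q)

def closed_in(d):
    lam = gamma * (1 - s0) / q
    pref = (1 - s0) / (q * s0) * math.exp(-lam) * (gamma / q)**d / math.factorial(d)
    return pref * ((1 + (q - 1) * s0)**d - (1 - s0)**d)

for d in range(5):
    print(d, f"{bayes_out(d):.6f} {closed_out(d):.6f}",
          f"{bayes_in(d):.6f} {closed_in(d):.6f}")
```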



4.6 The distribution of small component sizes 

The result (103) for the degree distribution outside the giant component
allows one to calculate the complete size distribution of non-extensive
components $\psi(S,\gamma,q)$. As noted in subsection 4.1 the small
components are almost certainly trees, i.e. a component of size $S = O(1)$
involves $S-1$ edges. From the degree distribution (103) we get for the
probability $P$ to find among $N(1-s_0)$ vertices a set of $S$ vertices that
make $S-1$ connections with each other and none with the remaining ones, to
leading order in $N$,

$$P \simeq \frac{N q}{\gamma\,S!}\left(\frac{\gamma}{q}(1-s_0)\right)^S
  e^{-\frac{\gamma}{q}(1-s_0)S}. \tag{110}$$

Not all of these sets form trees however, since not all of them are connected.
The number of labeled trees of $S$ vertices is $S^{S-2}$. For the number
of small components of size $S$ per vertex (of the complete graph) we hence
find

$$\psi(S,\gamma,q) = \frac{1}{N}\,S^{S-2}\,P
  = \frac{q}{\gamma}\,\frac{S^{S-2}}{S!}
  \left(\frac{\gamma}{q}(1-s_0)\,e^{-\frac{\gamma}{q}(1-s_0)}\right)^S. \tag{111}$$

Fig. 9 compares this expression for $\psi(S,\gamma,q)$ with results from
numerical simulations described in section 5. The agreement is again very
good except for relatively large components with correspondingly small
probabilities, where the statistical error in the simulation data prevents
a meaningful comparison.

Figure 9: Distribution $\psi(S,\gamma,q)$ of the size $S$ of non-extensive
components in graphs with an atypical number of components. Left panel is
for $\gamma = 0.25$ showing the results for $q = 0.135$ (full line and
circles) and $q = 2.72$ (dashed line and squares) respectively. Right panel
is for $\gamma = 3.0$ and $q = 5.29$ (full line and circles) and $q = 2.72$
(dashed line and squares) respectively. Lines show the analytical result
(111), symbols represent results from numerical simulations.

For $q = 1$ the result (111) for $\psi(S,\gamma,q)$ reduces to (59) after
using (48). Moreover comparison of (111) with (59) shows

$$\psi(S,\gamma,q) = \big(1-s_0(\gamma,q)\big)\,\psi^*(S,\gamma') \tag{112}$$

with

$$\gamma' = \frac{\gamma}{q}\big(1-s_0(\gamma,q)\big). \tag{113}$$

Hence in an atypical graph of the discussed type the vertices not belonging
to the giant component can be considered to be a typical random graph of
$N' = N(1-s_0)$ vertices with effective connectivity parameter $\gamma'$.
Multiplying by $S$ and summing over $S$ we find

$$\gamma' = \sum_S \frac{S^{S-1}}{S!}\left(\gamma' e^{-\gamma'}\right)^S, \tag{114}$$

implying $\gamma' < 1$ ^. We have hence always $s^*(\gamma') = 0$. On the one
hand this implies that the outside vertices are not able to build up a giant
component of their own and therefore shows the self-consistency of our
assumption that there is only one giant component in rare random graphs of
the considered type. On the other hand it allows one to easily derive results
for $q \neq 1$ from the corresponding ones for $q = 1$, $s^* = 0$. For the
total number of components per vertex we find, e.g., from (112) and (47)

$$c(\gamma,q) = \sum_S \psi(S,\gamma,q)
  = (1-s_0)\sum_S \psi^*(S,\gamma') \tag{115}$$

$$\phantom{c(\gamma,q)} = (1-s_0)\,c^*(\gamma')
  = (1-s_0)\left(1-\frac{\gamma'}{2}\right) \tag{116}$$

$$\phantom{c(\gamma,q)} = (1-s_0)\left(1-\frac{\gamma}{2q}(1-s_0)\right), \tag{117}$$

reproducing (30). Similarly one may rederive the expression (45) for the
second moment of $\psi(S,\gamma,q)$ which, however, follows more directly by
differentiating (114) with respect to $\gamma'$.
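The mapping of the outside vertices onto a typical graph with reduced connectivity can be illustrated numerically. The sketch below evaluates the size distribution in the form $\psi(S,\gamma,q) = \frac{q}{\gamma}\frac{S^{S-2}}{S!}(\lambda e^{-\lambda})^S$ with $\lambda = \frac{\gamma}{q}(1-s_0)$ and checks two sum rules; the solver assumes $\gamma > q$, $q \le 2$, and the helper names are hypothetical.

```python
import math

def giant_size(gamma, q, steps=200):
    """Bisection for the nonzero root of (95); assumes gamma > q and q <= 2."""
    def g(s):
        return 1 - s - q / (q - 1 + math.exp(gamma * s))
    lo, hi = 1e-6, 1 - 1e-12
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def psi(S, gamma, q, s0):
    """Number of small components of size S per vertex, computed in log space."""
    lam = gamma * (1 - s0) / q                 # effective connectivity gamma'
    log_psi = (math.log(q / gamma) + (S - 2) * math.log(S) - math.lgamma(S + 1)
               + S * (math.log(lam) - lam))
    return math.exp(log_psi)

gamma, q = 3.0, 2.0
s0 = giant_size(gamma, q)
gamma_eff = gamma * (1 - s0) / q
sizes = range(1, 300)
vertices_outside = sum(S * psi(S, gamma, q, s0) for S in sizes)
components = sum(psi(S, gamma, q, s0) for S in sizes)
print(f"gamma'={gamma_eff:.4f}")
print(f"sum S psi = {vertices_outside:.6f}   1 - s0 = {1 - s0:.6f}")
print(f"sum psi   = {components:.6f}   "
      f"(1-s0)(1-gamma'/2) = {(1 - s0) * (1 - gamma_eff / 2):.6f}")
```

The first sum rule states that the small components contain all vertices outside the giant component, the second corresponds to (117).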

We finally note that the evolution equations for the various probability 
distributions employed in this section are correct in the large N limit. For 
finite N, care must be paid to the fact that the addition of an edge or a 
vertex slightly changes the degrees of the vertices giving rise to $O(1/N)$
corrections. Similar corrections occur in the application of the cavity method
to spin glasses as discussed in chapter V in [14]. These additional terms
are, however, irrelevant in the calculations presented above. 

5 Numerical simulations 

In order to check the analytical results described above, we have performed
Monte Carlo simulations to generate graphs with atypical numbers of
components. We have performed simulations for graph sizes between $N = 50$
and $N = 10000$, the results shown are for $N = 1000$, where most
simulations were performed. The rare-event algorithm [24] used works in the
special case here as follows:

^This result may be obtained independently by using (24) in (113) to get
$\gamma' = (1-s_0)/(q s_0)\,\ln\big(1 + q s_0/(1-s_0)\big) < 1$ for all
$q > 0$ and all $s_0$ with $0 < s_0 < 1$.

One starts with an initial graph $G$, e.g. a typical random graph with
connectivity $\gamma$, and calculates the number of components $C(G)$. The
simulation is performed for a given value of $q$ ^. Each Monte Carlo step
consists of the following steps:

• A trial graph $G'$ is generated:

  - Copy $G$ to $G'$

  - Select one vertex $i$ in $G'$ randomly

  - Delete all edges adjacent to $i$

  - For all other vertices $j \neq i$ generate an edge $(i,j)$ with
    probability $\gamma/N$

• Calculate the number of components $C(G')$

• Accept $G'$ as new configuration $G$ with the Metropolis probability
  $\min\{1, q^{C(G')-C(G)}\}$
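The steps listed above can be sketched in a few lines. This is an illustrative implementation under simplifying assumptions, not the code used for the simulations: the graph is stored as a list of neighbour sets, the number of components is recomputed from scratch after each trial move, and all function names are hypothetical.

```python
import random

def count_components(adj):
    """Number of connected components, by depth-first search."""
    n = len(adj)
    seen = [False] * n
    comps = 0
    for start in range(n):
        if seen[start]:
            continue
        comps += 1
        seen[start] = True
        stack = [start]
        while stack:
            v = stack.pop()
            for w in adj[v]:
                if not seen[w]:
                    seen[w] = True
                    stack.append(w)
    return comps

def random_graph(n, gamma, rng):
    """Typical random graph: each pair is connected with probability gamma/n."""
    adj = [set() for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < gamma / n:
                adj[i].add(j)
                adj[j].add(i)
    return adj

def mc_step(adj, comps, gamma, q, rng):
    """One Monte Carlo step: redraw all edges of a random vertex, then accept/reject."""
    n = len(adj)
    i = rng.randrange(n)
    old = adj[i]
    new = {j for j in range(n) if j != i and rng.random() < gamma / n}
    for j in old:                       # build the trial graph in place
        adj[j].discard(i)
    for j in new:
        adj[j].add(i)
    adj[i] = new
    new_comps = count_components(adj)
    if rng.random() < min(1.0, q ** (new_comps - comps)):
        return new_comps                # accepted
    for j in new:                       # rejected: restore the old edges of i
        adj[j].discard(i)
    for j in old:
        adj[j].add(i)
    adj[i] = old
    return comps

rng = random.Random(1)
n, gamma, q = 50, 1.0, 3.0
adj = random_graph(n, gamma, rng)
comps = count_components(adj)
for sweep in range(5):                  # one sweep = n single-vertex steps
    for _ in range(n):
        comps = mc_step(adj, comps, gamma, q, rng)
print("components after 5 sweeps:", comps)
```

Since the trial edges are drawn from the unbiased ensemble, the acceptance ratio only involves the factor $q^{C(G')-C(G)}$. For larger graphs one would presumably update the component structure incrementally instead of recounting it after every step.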

In equilibrium, this procedure generates graphs distributed according
to the biased probability distribution $P(G;\gamma,q,N)$. Equilibration
was established in the following way. Two runs were started with
two different initial configurations. One was a typical random graph, the
other one for $q < 1$ ($q > 1$) a graph having minimal (maximal) number
of components, i.e. a fully connected graph (a graph without edges). In
the simulation the number of components $C(t)$ was recorded as a function
of Monte Carlo sweeps (MCS) $t$. The system was considered to be
equilibrated after time $t_0$, if $C(t_0)$ agrees within the typical
fluctuations for the two starting configurations. For $N = 1000$ this was
the case for all values of $q$ after $t_0 = 20$ MCS ($\gamma < 3$)
respectively $t_0 = 50$ MCS ($\gamma = 3$).
Hence the system equilibrates very quickly and does not show any sign of
glassy behaviour. The average value of $C$ found in the simulation depends
on the value of $q$. For values $q < 1$ the average number of components is
preferentially small, while for $q > 1$ it is high. After equilibration, we have
taken every to MCS graphs for analysis, 10^ graphs for each value of q. 
This allows to obtain various quantities, as the degree distributions or size 
distributions of components, as a function of q. 

In order to obtain numerical results for $w(c,\gamma)$ we need to determine
$P(C;\gamma,N)$ for all values $C \in [1,N]$. Since simulations for one given
value of $q$ are dominated by graphs with number of components close to the
typical number $N c(\gamma,q)$ corresponding to this value of $q$,
simulations at various values of $q$ have to be combined [23]. To this end
one records during the simulation for each value of $q$ the biased
probability distribution
$P(C;\gamma,q,N) = \sum_G P(G;\gamma,q,N)\,\delta(C,C(G))$. From the
definition of $P(G;\gamma,q,N)$ one finds for each $q$

$$P(C;\gamma,N) = q^{-C}\,Z(\gamma,q,N)\,P(C;\gamma,q,N). \tag{118}$$

In order to extract from this relation $P(C;\gamma,N)$ for values of $C$
around $N c(\gamma,q) \neq N c^*$, we still have to determine
$Z(\gamma,q,N)$. This in turn can be done by starting from $q = 1$, where the
value $Z(\gamma,1,N) = 1$ is known. For values of $q$ close to $q = 1$, the
measured ranges of the distributions $P(C;\gamma,1,N)$ and $P(C;\gamma,q,N)$
overlap with each other and $Z(\gamma,q,N)$ can be obtained from matching
both distributions in this overlapping range. This procedure can be iterated
to obtain $Z(\gamma,q,N)$ for values of $q$ differing by an increasing amount
from the starting point $q = 1$. Using (118) the complete distribution
$P(C;\gamma,N)$ can be determined. In our simulations using $N = 1000$ and
$\gamma = 0.25, 1, 2, 3$ between 22 and 27 different values of $q$ were
sufficient to obtain $P(C;\gamma,N)$ and therefore $w(c,\gamma)$ over the
full range.

*The parameter $q$ corresponds to the temperature $T$ used in Ref. [24] via
$q = \exp(1/T)$. The number of components $C$ corresponds to the energy $H$
via $C = -\mathrm{sign}(T)\,H$.
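The matching procedure can be sketched as follows. This is a simplified illustration and not the procedure of the original simulations in every detail: it assumes the histograms are supplied in an order in which each one overlaps with the range already reconstructed, it starts from $q = 1$, and it fixes $\ln Z(\gamma,q,N)$ by a plain average over the overlap window. The function name `combine` is hypothetical.

```python
import math

def combine(biased):
    """Combine biased histograms into ln P(C), following (118).

    biased: list of (q, {C: P_q(C)}) with the q = 1 histogram first; every
            later histogram must overlap the range reconstructed so far.
    Returns a dict {C: ln P(C)}.
    """
    log_p = {}
    for q, hist in biased:
        # unbiased estimate up to the unknown constant ln Z(q)
        raw = {C: math.log(p) - C * math.log(q) for C, p in hist.items() if p > 0}
        if not log_p:
            log_z = 0.0                              # Z(q = 1) = 1
        else:
            common = [C for C in raw if C in log_p]  # overlap window (must be non-empty)
            log_z = sum(log_p[C] - raw[C] for C in common) / len(common)
        for C, value in raw.items():
            log_p.setdefault(C, value + log_z)       # keep the earliest estimate
    return log_p

# Synthetic demonstration with a known target distribution (illustrative only).
target = {C: -0.01 * (C - 50) ** 2 for C in range(101)}
shift = math.log(sum(math.exp(v) for v in target.values()))
target = {C: v - shift for C, v in target.items()}

def biased_histogram(q, cutoff=1e-8):
    weights = {C: math.exp(v) * q**C for C, v in target.items()}
    z = sum(weights.values())
    # bins below the cutoff mimic values of C never visited in a finite run
    return {C: w / z for C, w in weights.items() if w / z > cutoff}

data = [(q, biased_histogram(q)) for q in (1.0, 1.5, 0.7, 2.2, 0.45)]
estimate = combine(data)
print(len(data[0][1]), "bins at q = 1,", len(estimate), "bins after combining")
```

With noisy simulation data one would presumably weight the bins in the overlap window by their statistical errors instead of taking a plain average.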

6 Conclusion 

In the present paper we have investigated large deviation properties in 
ensembles of Erdos-Renyi random graphs. In particular we have studied 
graphs that are atypical with respect to their number of components. We 
have shown that several of their properties such as their probability, the 
relative size of their giant component, as well as the second moment of 
the distribution of component sizes can be obtained from the Legendre 
transform with respect to $\ln q$ of the mean-field free energy of the
$q$-state Potts model. This generalizes the well-known connection between
typical properties of random graphs and the $q \to 1$ limit of the Potts free
energy. Therefore this free energy conveys also interesting information about
the random graph ensemble for values of $q \neq 1$. In particular the well
known first order phase transition in the mean-field Potts model for $q > 2$
gives rise to a non-convex part in the logarithmic probability of the graphs
corresponding to a bimodal probability distribution $P(C;\gamma,q,N)$.

In a second part we have rederived these results without recourse to 
the Potts model by requiring the "statistical stability" of the random graph 
ensemble under the insertion of an additional vertex or edge. This approach 
is made possible by the mere existence of the thermodynamic limit in which 
the number of vertices N tends to infinity. Besides reproducing the results 
obtained previously we have also pointed out some subtleties of this method 
when applied to exponentially rare configurations. Moreover we obtained as
additional results the complete degree distribution and the size distribution
of non-extensive components in atypical graphs. 

Our analytical findings describing the limit in which the number N of 
vertices of the graphs tends to infinity are in very good agreement with 
numerical simulations using a rare-event algorithm for the case N = 1000. 

It is well known that the typical properties of Erdos-Renyi random 
graphs with fixed number of edges and fixed probability of edges are equiv- 
alent. This equivalence does, however, not carry over to the large deviation 
behaviour. In fact in an ensemble with fixed number of edges there is no 
energetic contribution to the probability of a rare graph and the large de- 
viation characteristics will be rather different from those studied in the 
present paper. 

Further work to improve our understanding of the relationship between 
the two processes (one more edge or one more vertex) would be useful. 
It would also be interesting to see how the microscopic approach may be 
generalized for the study of graphs which are atypical with respect to other 
properties than their number of components as, e.g. their number of vertex 
covers ^U], their average degree or the size of their giant component, where 
the connection to the Potts model cannot be used. After completion of 
this work we became aware of another very recent application of the cavity 
method to characterize certain properties of rare random graphs [2^1 ■ 



A Appendix 

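In this appendix we give some intermediate results for the derivation of
the Potts free energy (15), see also USl-. The explicit determination of the
partition function is possible since the energy function (12) depends on the
configuration of spins $\{\sigma_i\}$ solely through the fractions

$$x(\sigma,\{\sigma_i\}) = \frac{1}{N}\sum_{i=1}^{N}\delta(\sigma,\sigma_i) \tag{119}$$

of variables $\sigma_i$ equal to $\sigma$. Clearly

$$\sum_\sigma x(\sigma,\{\sigma_i\}) = 1 \tag{120}$$

for all $\{\sigma_i\}$. The energy (12) may then be rewritten as

$$E(\{\sigma_i\}) = -\frac{N}{2}\sum_\sigma \big(x(\sigma,\{\sigma_i\})\big)^2
  - N h\sum_\sigma u_\sigma\,x(\sigma,\{\sigma_i\}) + O(1), \tag{121}$$

and the partition function becomes to leading order in $N$

$$Z = \sum_{\{\sigma_i\}}\exp\left(\frac{\beta N}{2}\sum_\sigma
  x(\sigma,\{\sigma_i\})^2 + \beta N h\sum_\sigma u_\sigma\,
  x(\sigma,\{\sigma_i\})\right)$$

$$\phantom{Z} = \sum_{\{x(\sigma)\}}\frac{N!}{\prod_\sigma (N x(\sigma))!}
  \exp\left(\frac{\beta N}{2}\sum_\sigma x(\sigma)^2
  + \beta N h\sum_\sigma u_\sigma\,x(\sigma)\right)$$

$$\phantom{Z} = \int\prod_\sigma dx(\sigma)\,\exp\left(N\sum_\sigma\left[
  \frac{\beta}{2}x(\sigma)^2 + \beta h\,u_\sigma\,x(\sigma)
  - x(\sigma)\ln x(\sigma)\right]\right).$$

The sum and the integral over $x(\sigma)$ are restricted to the normalized
subspace defined by (120). In the limit of large $N$ the integral may be
evaluated by the Laplace method. The Potts free energy (14) then reads

$$f(\beta,h,q,\{u_\sigma\}) = \mathop{\mathrm{extr}}_{\{x(\sigma)\}}\left[
  -\frac{1}{2}\sum_\sigma x(\sigma)^2 - h\sum_\sigma u_\sigma\,x(\sigma)
  + \frac{1}{\beta}\sum_\sigma x(\sigma)\ln x(\sigma)\right]. \tag{122}$$

We will need explicit results for the free energy and its derivatives for
$h = 0$ only. A suitable ansatz to perform the extremization in (122) for
this case is

$$x(0) = \frac{1}{q}\big(1+(q-1)s_0\big), \tag{123}$$

$$x(\sigma) = \frac{1}{q}(1-s_0) \quad\text{for } \sigma \neq 0, \tag{124}$$

with the yet undetermined parameter $s_0$. This ansatz allows for a possible
spontaneous breaking of the Potts symmetry at low temperature and
automatically fulfills the normalization (120). It gives rise to

$$f(\beta,q) = \mathop{\mathrm{extr}}_{s_0}\left[-\frac{1}{2q}
  - \frac{q-1}{2q}s_0^2 - \frac{\ln q}{\beta}
  + \frac{1+(q-1)s_0}{\beta q}\ln\big(1+(q-1)s_0\big)
  + \frac{q-1}{\beta q}(1-s_0)\ln(1-s_0)\right], \tag{125}$$

which is identical with (15). Differentiating the expression in the brackets
in (125) with respect to $s_0$ we find for the extremum value
$s_0(\beta,q)$ the equation

$$e^{\beta s_0} = \frac{1+(q-1)s_0}{1-s_0}, \tag{126}$$

reproducing ((T?)|l .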